For a recent project for a rather large client, we faced an interesting challenge: as part of the effort to rebuild the existing web shop infrastructure into a modern cloud-native system landscape, we also completely revamped the fulfillment system, including the generation of delivery documents and invoices.
Because the client is active in a lot of different countries (sometimes even with multiple subsidiaries per country), we decided to use a SaaS CMS so that the stakeholders can change the texts of these documents themselves. Another reason we chose this approach is that change requests for 100+ different variations meant a lot of effort for the fulfillment team as well as delays for the requesters, both of which we wanted to avoid.
However, the decision to use an external CMS brought some interesting challenges. Namely, in some cases the texts depend on the specifics of the order. To give an example: some omnichannel orders (specifically Click & Collect and similar) have different processes for returns, which must be reflected on the documents.
The solution we came up with is to design a minimal DSL (Domain-Specific Language) that allows our stakeholders to specify conditions that determine which texts to print based on the actual order data. These conditions are represented as reusable objects in the CMS, so that you don’t need to create multiple objects for the same logical condition. For technical reasons, they can only be used with specific fields, which we call conditional fields. A conditional field contains a default text as well as an arbitrary number of special cases (possibly zero, in case that specific shop or language doesn’t need any special cases for that field), each of which refers to exactly one condition object.
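As a rough sketch, the content model can be pictured like this (all names here are hypothetical and only illustrate the structure, not the actual CMS schema):

import java.util.List;

// A reusable condition object; the expression is written in the DSL described below.
record ConditionObject(String name, String expression) {}

// One special case of a conditional field: if the condition matches the order, this text is used.
record SpecialCase(ConditionObject condition, String text) {}

// A conditional field: a default text plus any number of special cases (possibly zero).
record ConditionalField(String defaultText, List<SpecialCase> specialCases) {}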
Originally, we only used these conditional fields for things like the return information. However, they turned out to work so well that we now use them for any translation that depends on the order data (for example, the payment method or the delivery/omnichannel type).


The DSL itself is rather simple but designed to be extensible (i.e. more operators can easily be added later; we have actually already made use of this property). Another design requirement was to make it as easy as possible for non-technical people to understand. For the same reason, for more complex expressions (using “and” and “or”) we didn’t define an order of precedence, but instead enforce the use of brackets as soon as the expression might be ambiguous without them.
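For example, a hypothetical expression like

deliveryType is "click&collect" and paymentType is "paypal" or paymentType is "creditcard"

would be rejected, because without a precedence rule it can be read in two different ways. The author instead has to add brackets that make the intent explicit:

deliveryType is "click&collect" and (paymentType is "paypal" or paymentType is "creditcard")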
Atomic conditions consist of three components: the field of the order object to compare (possibly even nested fields), the operator to use (at the time of writing there are three operators: “is”, “is not” and “starts with”) and a string to compare against.

Here are some examples of conditional expressions:
deliveryType is "click&collect"
(paymentType is "paypal" or paymentType is "creditcard") and documentType is "creditNote"
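Internally, expressions like these map to a small AST that is evaluated against the order data. The following is a minimal sketch of what that could look like (assuming Java 17 records; the names are illustrative, and the order is simplified to a flat key/value map, neither of which necessarily matches our production code):

import java.util.Map;

// Hypothetical AST for the DSL; nested order fields could be addressed via dotted paths.
interface Condition {
    boolean matches(Map<String, String> order);
}

enum Operator { IS, IS_NOT, STARTS_WITH }

record Atomic(String field, Operator op, String value) implements Condition {
    public boolean matches(Map<String, String> order) {
        String actual = order.getOrDefault(field, "");
        return switch (op) {
            case IS -> actual.equals(value);
            case IS_NOT -> !actual.equals(value);
            case STARTS_WITH -> actual.startsWith(value);
        };
    }
}

record And(Condition left, Condition right) implements Condition {
    public boolean matches(Map<String, String> order) {
        return left.matches(order) && right.matches(order);
    }
}

record Or(Condition left, Condition right) implements Condition {
    public boolean matches(Map<String, String> order) {
        return left.matches(order) || right.matches(order);
    }
}

With this model, the second example above corresponds to an And node whose left child is the bracketed Or of the two payment type comparisons and whose right child is the documentType comparison.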
The calling system, which fetches the content from the CMS and forwards it to the PDF generation, contains the parser for the DSL and transforms the texts according to which conditions match the incoming order object.
This parser is implemented directly in code (Java), since at the time it seemed overkill to use a parser generator like ANTLR for such a simple DSL. We built a recursive-descent parser with backtracking but without lookahead. The rationale was that it’s easy to build, and since both the grammar and the potential inputs are quite small, the exponential worst-case runtime basically doesn’t matter.
The most complicated part to implement was the backtracking mechanism. However, since there are basically just three different cases to take care of (namely: token rule, alternative rule and sequence rule), we were able to abstract the backtracking completely out of the actual parsing logic.
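The idea can be sketched as follows (simplified, with hypothetical names): every rule parses against a token stream that can be marked and reset, so a failed attempt always restores the position it started from, and the individual grammar rules never have to deal with backtracking themselves.

import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

interface Token {}
interface Node {}
record LeafNode(Token token) implements Node {}
record SequenceNode(List<Node> children) implements Node {}

// A token stream that supports marking and resetting the current position.
class TokenStream {
    private final List<Token> tokens;
    private int position = 0;
    TokenStream(List<Token> tokens) { this.tokens = tokens; }
    int mark() { return position; }
    void reset(int mark) { position = mark; }
    Optional<Token> next() {
        return position < tokens.size() ? Optional.of(tokens.get(position++)) : Optional.empty();
    }
}

interface Rule {
    // Tries to parse at the current position; on failure, the position is restored.
    Optional<Node> parse(TokenStream tokens);
}

// Case 1: a token rule consumes exactly one token of the expected type.
record TokenRule(Class<? extends Token> expected) implements Rule {
    public Optional<Node> parse(TokenStream tokens) {
        int mark = tokens.mark();
        Optional<Token> token = tokens.next().filter(expected::isInstance);
        if (token.isEmpty()) tokens.reset(mark);
        return token.map(LeafNode::new);
    }
}

// Case 2: an alternative rule tries each option in order; since a failed option
// resets the position, the next option starts from a clean state.
record AlternativeRule(List<Rule> alternatives) implements Rule {
    public Optional<Node> parse(TokenStream tokens) {
        for (Rule alternative : alternatives) {
            Optional<Node> result = alternative.parse(tokens);
            if (result.isPresent()) return result;
        }
        return Optional.empty();
    }
}

// Case 3: a sequence rule requires all parts to match in order;
// if any part fails, the whole sequence backtracks to its starting position.
record SequenceRule(List<Rule> parts) implements Rule {
    public Optional<Node> parse(TokenStream tokens) {
        int mark = tokens.mark();
        List<Node> children = new ArrayList<>();
        for (Rule part : parts) {
            Optional<Node> result = part.parse(tokens);
            if (result.isEmpty()) {
                tokens.reset(mark);
                return Optional.empty();
            }
            children.add(result.get());
        }
        return Optional.of(new SequenceNode(children));
    }
}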
Another interesting detail is that, because we have keywords containing spaces, we were unable to use the standard Java Scanner class for lexical analysis. Instead, we had to build our own scanner that only uses patterns to generate the next token. This is a consequence of how the grammar was designed: we could have avoided the custom solution by splitting the operators containing whitespace into sequences of tokens (e.g. “is” “not” instead of “is not”). At the time, having only one token per operator seemed like the less complex solution. In retrospect, however, it would probably have been better to move the complexity into the grammar and get rid of the custom scanner component.
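Such a pattern-based scanner can stay quite small. Roughly sketched (again with hypothetical names), each token type carries a regular expression that is matched at the current input position, which is what allows a multi-word keyword like “is not” to become a single token:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Token types with their patterns; "is not" must be listed before "is",
// because the patterns are tried in declaration order.
enum TokenType {
    IS_NOT("is\\s+not\\b"),
    STARTS_WITH("starts\\s+with\\b"),
    IS("is\\b"),
    AND("and\\b"),
    OR("or\\b"),
    LPAREN("\\("),
    RPAREN("\\)"),
    STRING("\"[^\"]*\""),
    FIELD("[A-Za-z][A-Za-z0-9.]*");

    final Pattern pattern;
    TokenType(String regex) { this.pattern = Pattern.compile(regex); }
}

record ScannedToken(TokenType type, String text) {}

class PatternScanner {
    static List<ScannedToken> scan(String input) {
        List<ScannedToken> result = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            // Skip whitespace between tokens.
            if (Character.isWhitespace(input.charAt(pos))) { pos++; continue; }
            ScannedToken token = nextToken(input, pos);
            if (token == null) {
                throw new IllegalArgumentException("Unexpected input at position " + pos);
            }
            result.add(token);
            pos += token.text().length();
        }
        return result;
    }

    // Tries every pattern at the current position and returns the first match.
    private static ScannedToken nextToken(String input, int pos) {
        for (TokenType type : TokenType.values()) {
            Matcher matcher = type.pattern.matcher(input).region(pos, input.length());
            if (matcher.lookingAt()) {
                return new ScannedToken(type, matcher.group());
            }
        }
        return null;
    }
}

Scanning an input like deliveryType is not "paypal" would then yield the three tokens FIELD, IS_NOT and STRING.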
This system has been in service for the past few months, and thus far it has been working without any major issues. As mentioned earlier, we have also already made use of the extensibility of our grammar to add the new “starts with” operator, which literally only took copying the respective token class, adjusting its pattern and adding the AST logic.
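In terms of the sketches above, such an extension boils down to two small additions (shown here in isolation):

// One new token pattern in the scanner...
STARTS_WITH("starts\\s+with\\b"),

// ...and one new case in the evaluation of atomic conditions.
case STARTS_WITH -> actual.startsWith(value);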
The reception by the stakeholders has generally been very good, although they are currently only using preexisting condition objects, as there has not yet been a case where the existing ones were not comprehensive enough to model a specific use case.
