Large DMN Tables (Many Rules and Many Inputs/Outputs)

StephenOTT · June 27, 2016, 5:40pm

Looking at DMN and working through some example use cases.

Curious to see if anyone else is working with large or “complex” DMN tables.
Are you building and maintaining them in the Modeler, or in something like Excel?

Example file: DogLicensesPrice.dmn (23.8 KB)

Comparing with the Excel version:

The key visual difference is the “grouping” of similar rule rows, so you can easily see when the change occurs.
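As a rough illustration of that grouping, the sketch below blanks a cell when it repeats the value in the row above, which is roughly what merged cells do in a spreadsheet. The function name and the sample values are hypothetical and are not taken from the attached file.

```python
def blank_repeats(rows, group_cols):
    """Return a copy of `rows` with repeated leading cells blanked.

    A cell in one of the first `group_cols` columns is blanked only when it,
    and every column to its left, equals the row above. The rows are assumed
    to be sorted already so that similar rules sit next to each other.
    """
    grouped = []
    previous = None
    for row in rows:
        display = list(row)
        same_prefix = previous is not None
        for col in range(group_cols):
            same_prefix = same_prefix and row[col] == previous[col]
            if same_prefix:
                display[col] = ""
        grouped.append(display)
        previous = row
    return grouped
```

The point where a blank run ends is where the rule changes, which is the visual cue described above.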

Thoughts?

nvanbelle · June 28, 2016, 9:07am

Looking at the XML representation of your example, I don’t think it is very hard to generate it from Excel (VBA) or another small application.
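A minimal sketch of such a small application, using only the Python standard library. It assumes the DMN 1.1 namespace, string-typed columns, and a FIRST hit policy; the ids and the `rows_to_dmn` name are made up for illustration, and the output has not been validated against the Camunda engine.

```python
import xml.etree.ElementTree as ET

# DMN 1.1 namespace (assumption: match whatever your own .dmn file declares).
DMN_NS = "http://www.omg.org/spec/DMN/20151101/dmn.xsd"


def rows_to_dmn(decision_id, input_names, output_names, rows):
    """Build DMN decision-table XML from plain rows.

    Each row holds one cell per input followed by one cell per output.
    Cell text is copied verbatim into the entry, so it must already be a
    valid expression (for example '"small"' or '< 5').
    """
    ET.register_namespace("", DMN_NS)

    def q(tag):
        return f"{{{DMN_NS}}}{tag}"

    width = len(input_names) + len(output_names)
    definitions = ET.Element(q("definitions"), {
        "id": "definitions", "name": "definitions",
        "namespace": "http://example.org/dmn",  # placeholder namespace
    })
    decision = ET.SubElement(definitions, q("decision"),
                             {"id": decision_id, "name": decision_id})
    table = ET.SubElement(decision, q("decisionTable"),
                          {"id": "decisionTable", "hitPolicy": "FIRST"})
    for i, name in enumerate(input_names):
        inp = ET.SubElement(table, q("input"), {"id": f"input{i}", "label": name})
        expr = ET.SubElement(inp, q("inputExpression"),
                             {"id": f"inputExpression{i}", "typeRef": "string"})
        ET.SubElement(expr, q("text")).text = name
    for i, name in enumerate(output_names):
        ET.SubElement(table, q("output"),
                      {"id": f"output{i}", "label": name, "name": name,
                       "typeRef": "string"})
    for r, row in enumerate(rows):
        if len(row) != width:
            raise ValueError(f"row {r} has {len(row)} cells, expected {width}")
        rule = ET.SubElement(table, q("rule"), {"id": f"rule{r}"})
        for c, cell in enumerate(row):
            kind = "inputEntry" if c < len(input_names) else "outputEntry"
            entry = ET.SubElement(rule, q(kind), {"id": f"{kind}{r}_{c}"})
            ET.SubElement(entry, q("text")).text = str(cell)
    return ET.tostring(definitions, encoding="unicode")
```

The rows could come from a CSV export of the spreadsheet, for example via the standard `csv` module.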

BerndRuecker · July 4, 2016, 8:25am

We have a small example of how to transform Excel to DMN: https://blog.camunda.org/post/2016/01/excel-dmn-conversion/. You can use that code as a basis for developing your own conversion; we have done that successfully in real-life projects, and it works well.

The grouping needs some extension to that conversion though.

Cheers
Bernd

mppfor_manu · January 29, 2017, 9:11pm

We are about to consider this same question. My concern with Excel is that it is, natively, freeform. I would prefer something that “constrains” the user’s actions. To be sure, you could write VBA within Excel to create a more form-like tool, but it’s still a bit of a kludge.

The Excel to DMN converter does work, and it’s pretty quick. However, it’s very limited in what it can support. OpenL Tablets, another DMN implementation, uses Excel as its input source and might support a richer set of features.

Camunda Modeler, while an improvement over Excel in many areas, still lacks critical features for large tables and for enterprise management of them. That’s where we think we’re going to need to develop our own tools.

One final thing: I created a 1,000-rule table (3 inputs, 2 outputs) in Excel, converted it into DMN, then attempted to edit it with Camunda Modeler. It essentially choked on the table, though if you were extremely patient (waiting several minutes or more per action), you might be able to make minor changes. The Camunda BPMN Cockpit GUI was similarly slow. It’s interesting to note that benchmarks for the DMN engine only use tables of up to 100 rules and 1 or 2 inputs.

StephenOTT · January 29, 2017, 9:19pm

@mppfor_manu what type of rules are you storing? We looked at a few different scenarios, and we found that in many cases with “1000s” of rules, a DMN table (or anything similar) would not be the best option. A better option was to build a micro service that computed the logic and exposed the result as a web service. We found this to hold for the “1000s” of rules pattern because managing that number of rules means you are likely dealing with lots of minor variance, and the user likely needs a large amount of knowledge about the data to make the correct changes and understand their impact.
Building the micro service allowed us to create other UIs for management of the rules, additional classes and functions that check for rule specific errors, etc.
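As one hypothetical example of a rule-specific check (the actual checks used in that service are not described here), the sketch below flags exact-match conditions that appear more than once with different outputs. The `find_conflicts` name and the rule shape are assumptions for illustration.

```python
from collections import defaultdict


def find_conflicts(rules):
    """Return condition tuples that map to more than one distinct output.

    `rules` is a list of (conditions, output) pairs, where `conditions`
    is a sequence of exact-match values.
    """
    outputs_by_condition = defaultdict(set)
    for conditions, output in rules:
        outputs_by_condition[tuple(conditions)].add(output)
    return sorted(
        condition
        for condition, outputs in outputs_by_condition.items()
        if len(outputs) > 1
    )
```

A check like this can run before a rule change is accepted, so contradictory rows are rejected instead of silently depending on row order.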

mppfor_manu · January 29, 2017, 9:41pm

This is a common issue and one which we’re dealing with right now. We’re relatively new to Camunda/DMN, but not to workflows. We need to figure out general rules for the encoding of business logic. Right now, the two primary choices are Camunda BPMN or Camunda DMN. Management of those methods is driven by current tooling (Camunda Modeler), which we generally only use for DMN. Nonetheless, the tools available are only marginally adequate when a large number of rules is involved.

The problem, as I see it, is how much normalization you apply to your rule sets. With the addition of DRD support in Camunda 7.6, you can have a number of smaller tables that are formally linked within a single DMN endpoint. I cannot comment on the relative performance of large single tables versus a DRD “tree” of smaller tables. However, from a management and “comprehensibility” standpoint, the DRD is easier to use, as the logic is a bit more obvious and the individual tables are smaller.
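To make the normalization idea concrete, here is a small sketch that uses plain dictionaries in place of DMN tables. The alarm names, tiers, and actions are invented for illustration; the first table classifies a raw value and the second decides on the classification, which mirrors how one decision feeds another in a DRD.

```python
# Hypothetical example data; none of these names come from a real rule set.
SEVERITY_BY_ALARM = {
    "link_down": "critical",
    "power_loss": "critical",
    "high_cpu": "major",
    "fan_warning": "minor",
}

ACTION_BY_SEVERITY_AND_TIER = {
    ("critical", "gold"): "page_on_call",
    ("critical", "standard"): "open_ticket",
    ("major", "gold"): "open_ticket",
    ("major", "standard"): "log_only",
    ("minor", "gold"): "log_only",
    ("minor", "standard"): "log_only",
}


def decide_action(alarm, tier):
    """Evaluate the two small tables in sequence."""
    severity = SEVERITY_BY_ALARM[alarm]
    return ACTION_BY_SEVERITY_AND_TIER[(severity, tier)]


def flatten():
    """Expand the chained tables into the equivalent single flat table."""
    tiers = sorted({tier for _, tier in ACTION_BY_SEVERITY_AND_TIER})
    return {
        (alarm, tier): decide_action(alarm, tier)
        for alarm in SEVERITY_BY_ALARM
        for tier in tiers
    }
```

Both forms give the same answers; the chained form keeps the classification in one place, so a change to how an alarm is classified touches a single row.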

Another challenge is understanding the logic of an entire workflow that combines BPMN and DMN. To solve this, I believe we’re going to modify an existing tool or use external tooling to provide diagrams that include both the BPMN elements and the decisions. I’m not exactly sure how this would look, but in theory such a system, were it “bidirectional” (able both to display current process definitions and to modify their native definitions), would be better, as users could more easily see the decisions in the context of the overall process.

Another idea would be to use a document store database like MongoDB to act as the DMN engine. This would require building an abstraction layer to mimic DMN engine functionality as well as provide an acceptable interface. The advantage of this would be that you could leverage any database tools for managing the DMN tables.
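A very rough sketch of what the core of such an abstraction layer might do. A plain list of dictionaries stands in for the document collection (no database driver is used), and the field names `order`, `when`, and `then` are hypothetical. It only mimics exact-match conditions under a first-hit policy, which is far less than a real DMN engine supports.

```python
def evaluate_first_hit(rule_docs, variables):
    """Return the outputs of the first matching rule document, else None.

    Documents are tried in ascending `order`. A document matches when every
    key in its `when` mapping equals the corresponding variable, so an empty
    `when` acts as a catch-all rule.
    """
    for doc in sorted(rule_docs, key=lambda d: d["order"]):
        if all(variables.get(key) == value for key, value in doc["when"].items()):
            return doc["then"]
    return None
```

For example, a document such as `{"order": 1, "when": {"alarm": "link_down"}, "then": {"queue": "noc"}}` would match any variable set whose `alarm` is `link_down`.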

We are a very large telecommunications company trying to collapse legacy automation and management systems into a more cohesive framework, which is why we’re using Camunda and DMN. I constantly tell everyone that nothing about any of this is simple. I think DMN may be the answer to a lot of needs, but I’m still trying to figure out if it is fast enough to support our requirements for complex tables.

StephenOTT · January 30, 2017, 1:48am

@mppfor_manu can you give some examples of the rules you are trying to use?

thorben · January 30, 2017, 9:25am

It’s great to hear that you have used the converter. If you would like to see it improve, please raise issues in the converter’s GitHub project. This is a side project of mine alongside my regular Camunda activities, and since I don’t use it in my daily work, I decided to improve it only based on user feedback (if any). You could also contribute code if you like.

mppfor_manu · January 30, 2017, 1:40pm

Firstly, I should explain that the vast majority of our work is automating “machine” processes. While Camunda is very flexible, my perception is that the emphasis is on human driven tasks, albeit with a significant portion of that involving some machine-to-machine activity. Our work is attempting to take people out of the process so that we provide more consistent, reliable, and faster resolution of problems. We only want people involved where encoded logic cannot reliably drive the required activities (i.e. fallout management).

We have “workflows” that might contain 10, 20, or 30 steps (activities). At many of these steps, there may be anywhere from 2 to over 100 different “paths” (decisions) that cause it to branch off to a different set of activities based upon current event attributes (process variables). One challenge is deciding when to use BPMN versus DMN versus hand-coded (e.g. Java classes) encoding* of workflow logic.

One current system has two attributes that drive workflows. There are approximately 100 different values for each attribute, which means there are as many as 10,000 different paths that can be taken at the first major decision point. After that point, sub-workflows may also require evaluation of a large number of attributes (based upon other values passed into the workflow or gathered during its execution). Nonetheless, the tooling for each decision point would be chosen using the same guidelines.

For example, we might say that if you come to an activity that provides for multiple execution paths, of which you must choose one, then if the number of possible paths exceeds five (5), you should use a DMN table. Otherwise, you should use BPMN. The reason for this guideline is that more than five sequence flows off of an exclusive gateway becomes “messy” and diagrams may become too large to manage. Camunda would not care, but again this comes down to management of the decisions by people, not machines.

I have decided that the best approach to creating these guidelines is to consider ease of creation and maintenance of the logic for that activity. In the end, we spend little (actually no) time executing workflows. Camunda does that for us. Most of our time is spent creating, managing, and analyzing workflows. Therefore, the choice of tool(s) used to encode workflow logic is the most critical decision.

As we are speaking of tools, I poked around the converter code and, to my limited Java experience, it seemed relatively straightforward. I think my colleagues who are real Java programmers would be able to modify it to suit a broader range of needs.

One thing it would be good for is “one-off” conversion of rules from other systems. For example, we currently have large database tables which contain rules that are accessed by returning rows based upon an SQL query. If the logic of this query were “unary” (which I think means “does this match”) for several key columns, then translating the table to DMN through the converter would be very easy. More complex logic, again the result of an SQL query, could also be encoded by extending the converter code.
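A sketch of that one-off translation for the simple “does this match” case, using an in-memory style SQLite connection from the Python standard library rather than the Java converter. The `table_to_rules` and `quoted` names, and the column layout, are hypothetical; each database row becomes one rule whose cells are quoted string literals ready to be written into DMN entries.

```python
import sqlite3


def quoted(value):
    """Render a value as a quoted string literal for an exact-match entry."""
    text = str(value)
    if '"' in text:
        # Escaping rules are left out of this sketch on purpose.
        raise ValueError(f"value contains a double quote: {text!r}")
    return f'"{text}"'


def table_to_rules(conn, table, key_columns, result_columns):
    """Turn each row of a rules table into input and output entry texts.

    `key_columns` become input entries and `result_columns` become output
    entries. Table and column names are interpolated into the SQL, so they
    must come from trusted configuration, never from user input.
    """
    columns = list(key_columns) + list(result_columns)
    sql = f"SELECT {', '.join(columns)} FROM {table} ORDER BY rowid"
    rules = []
    for row in conn.execute(sql):
        cells = [quoted(value) for value in row]
        rules.append({
            "inputs": cells[:len(key_columns)],
            "outputs": cells[len(key_columns):],
        })
    return rules
```

The resulting entry texts could then be handed to whatever writes the DMN XML; range checks or other non-equality logic would need a richer mapping than this.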

As I mentioned in another post, OpenL Tablets’ implementation of DMN does provide a slightly more sophisticated conversion mechanism, but also lacks the constraints that a dedicated, albeit more general use, DMN tool would impose upon the user.

Congratulations on reaching the end of this long post! Reward yourself with one adult beverage!

* In this context, encoding means how you store logic, not necessarily the “language” used to do so. (I acknowledge that DMN is encoded in XML, but nothing prevents you from abstracting the logic it applies to reach decisions into a database, for example.)