I’m planning on using DMN in a big way for applications that will make very high volumes of requests and must get decisions back very quickly. My initial benchmarks indicate that it takes an average of 100 milliseconds for a single DMN instance, running in the Camunda BPMN engine and accessed through the REST API, to provide a decision from a relatively small table (10 rules, 2 inputs, 1 output). Perhaps the use of the REST API is the limiting factor.
I’m confused by some of the information in the blog posts regarding DMN engine performance. The bar graph at the following URL (https://blog.camunda.org/post/2015/12/dmn-benchmark/) seems to imply performance exceeding 200,000 decisions (is that what an “evaluation” is?) per second, while the table on the same page shows a maximum performance of 220 decisions per second for the simplest tables.
The latter number seems more realistic based upon my benchmarks, but I would like to know what to actually expect as we plan our use. I realize that performance can vary based upon a number of factors. Our servers are more powerful than the one used in the benchmark. Our tables vary greatly in size.
What’s also not clear is how you “turn off” features that impact performance. The DMN documentation doesn’t say how you can dynamically shut off features for certain DMN calls. Do you have to turn off history and the repository for all of Camunda to do this? If so, it would imply that if you’re looking for the absolute best performance for DMN and you have to preserve history (which I can’t imagine you would not), then you must run a separate instance of Camunda just for DMN, which means that it must be accessed strictly through the REST interface.
So, what is the benchmark number?
Can you spin up more DMN engines that evaluate the same set of tables and use a load balancer (or equivalent) to spread requests over them?
Is it better to create standalone instances of the DMN engine as described on GitHub rather than use the engine embedded with the Camunda BPMN engine? If yes, what deployment mechanisms may be used?
The performance of evaluating a DMN decision table is influenced by many factors (e.g. size of the table, kind of expressions, expression language, history on/off, how you invoke the table, etc.). If performance is most important for your use case, then you should switch off the history (or use your own history backend) and not use the deployment repository, as described in the blog post.
Regarding your other questions:
Of course, but you have to build this yourself.
Yes, if you want the best performance.
You can just handle the deployment yourself. For example, load the DMN decision tables from a database or the file system and cache them. The deployment repository is slower because it needs to check the version of the definition.
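As a rough illustration of the first answer (spreading requests over several engines that evaluate the same tables), here is a minimal in-process round-robin sketch. It uses only the Java standard library. `RoundRobinDispatcher` and the `Function` evaluators are hypothetical names for this example, not part of Camunda; each evaluator stands in for whatever call reaches one engine. In a real deployment this job would more likely be done by an HTTP load balancer in front of several servers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical helper: spreads requests over several evaluators in round-robin order. */
final class RoundRobinDispatcher<I, O> {
    private final List<Function<I, O>> evaluators;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinDispatcher(List<Function<I, O>> evaluators) {
        if (evaluators.isEmpty()) {
            throw new IllegalArgumentException("need at least one evaluator");
        }
        // Defensive copy so later changes to the caller's list have no effect.
        this.evaluators = new ArrayList<>(evaluators);
    }

    /** Sends the input to the next evaluator in turn; safe to call from many threads. */
    O evaluate(I input) {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), evaluators.size());
        return evaluators.get(index).apply(input);
    }
}
```

All evaluators must hold the same version of the decision tables, otherwise two identical requests could get different answers.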
Note that you can run the DMN benchmark with your DMN decision tables in your environment. Check the Github repository for details.
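Before running the full benchmark, it may help to keep latency and throughput apart. A single client that waits for each reply cannot exceed 1000 / latency-in-ms calls per second, so an average of 100 ms per REST call caps that client at 10 decisions per second and is not directly comparable with a throughput figure. The sketch below is not the benchmark from the GitHub repository; `ThroughputProbe` is a hypothetical helper using only the Java standard library, and the `Runnable` stands in for one decision evaluation, however you invoke it.

```java
/** Hypothetical helper for rough throughput measurements. */
final class ThroughputProbe {

    /** Runs the task repeatedly and returns completed calls per second. */
    static double callsPerSecond(Runnable task, int warmupCalls, int measuredCalls) {
        // Warm-up calls give the JIT compiler a chance to optimize the hot path.
        for (int i = 0; i < warmupCalls; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredCalls; i++) {
            task.run();
        }
        // Guard against a zero elapsed time for trivially short tasks.
        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        return measuredCalls * 1_000_000_000.0 / elapsedNanos;
    }

    /** Upper bound on calls per second for one client that waits for each reply. */
    static double sequentialLimit(double latencyMillis) {
        return 1000.0 / latencyMillis;
    }
}
```

A loop like this is only a first approximation; a dedicated benchmarking harness gives more trustworthy numbers.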
Does this help you?
Perhaps I wasn’t clear.
How do you turn off history for DMN only?
If you do turn off “history”, do you lose history for everything Camunda does? We must have process history.
I apologize if I’ve missed obvious documentation of the procedure. I do actually read the manuals, but sometimes don’t understand what you must do to implement something.
As far as I understand your use case, the performance of the DMN evaluation is most important for you. To reach this goal, you should use a standalone DMN engine, which has no history by default.
But if you use the DMN engine inside the process engine (i.e. via the decision service or a business rule task), then you can disable the history for DMN by choosing a history level < FULL or by implementing a custom history level.
Does this help you?
Yes, that’s the type of clear answer I was looking for and is very helpful. We’ll focus on the standalone DMN engine (which we have running).
If it’s not too much trouble, could you tell me if the standalone engine supports a deployment method like the embedded (in the Camunda BPM distribution) where you can push a DMN/DRD table using a REST interface, or must we write that ourselves?
The standalone DMN engine has no repository. It just parses a DMN file from a given input stream, file or model instance.
So if you need a repository, then you have to implement it yourself.
Note that it also has no REST API that can be called to evaluate a decision from outside.
You can still deploy the decision along with processes into Camunda. Just don’t use DecisionService to execute them, but fetch the XML via RepositoryService#getDecisionModel and parse/execute it with a standalone decision engine (i.e. not the one that comes with the process engine). In order to avoid hitting the database with every execution request, you could build some caching for the parsed models.
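The caching idea above could look roughly like the following sketch. It uses only the Java standard library: `ParsedDecisionCache` is a hypothetical name, the type parameter `D` stands in for the parsed decision type, and the `loader` function is assumed to do the expensive part (fetch the XML from the repository and parse it with the standalone engine), which is not shown here.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical cache for parsed decision models, keyed by decision definition id. */
final class ParsedDecisionCache<D> {
    private final ConcurrentMap<String, D> cache = new ConcurrentHashMap<>();
    private final Function<String, D> loader;

    /** The loader is assumed to fetch and parse the model for a given definition id. */
    ParsedDecisionCache(Function<String, D> loader) {
        this.loader = loader;
    }

    /** Returns the cached model, invoking the loader only on the first request per id. */
    D get(String decisionDefinitionId) {
        return cache.computeIfAbsent(decisionDefinitionId, loader);
    }

    /** Drops one entry, for example after a table has been redeployed. */
    void evict(String decisionDefinitionId) {
        cache.remove(decisionDefinitionId);
    }

    int size() {
        return cache.size();
    }
}
```

How the cache learns about a newly deployed version is left open here; an explicit `evict` call or a time-based refresh are two possible choices.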
The question we’re going to have to answer is, is it worth the effort to build all the deployment management tooling for a standalone instance, or should we simply run a separate instance of Camunda, with all the performance degrading options turned off, and simply use it for DMN processing only?
Would there be a substantial difference in performance between a standalone instance and DMN within a Camunda instance where it is the only component being used and no history or other performance-degrading functions are enabled? Or does a standalone instance (which must still be run in a Java container with its own fixed overhead) offer an order of magnitude better performance because it is not “bound” to a Camunda BPM engine instance?
The easiest way is to run a Camunda instance with a separate DMN engine, as Thorben suggested. That way you can use the existing REST API to deploy new definitions. However, if you want to evaluate a decision via REST, then you have to build that endpoint yourself so that it uses the cached decision instead of loading it from the database.
Or you can build the deployment management yourself. You may end up with a solution that has less overhead. The difference in performance should not be large.
Does this help you?
I’m not completely clear on your answer.
When you say a “separated” DMN engine, do you mean a free-standing (embedded) instance?
What I was suggesting is just running a Camunda BPMN server, but only use it for DMN. That way you get easy deployment and a central repository.
It sounds like you are saying that if we use it like that and shut off all history, etc., then the performance difference between it and a “custom” solution won’t be that large.
You can’t use the embedded DMN engine of the Camunda instance because it would get the decision definition from the database. In order to avoid this, you could create an additional standalone DMN engine, as Thorben explained.
I didn’t explain myself very well and used incorrect terminology. “Embedded” as you rightly point out, refers to the DMN engine contained within the Camunda BPMN distribution we use to process workflows. “Standalone” would more properly refer to a DMN instance separate from any BPMN engine components (to the degree possible).
We’re just trying to figure out the best way to get both performance and manageability. Camunda does not provide any enterprise management tools for large, distributed DMN usage, so we will be building our own. The original question was intended to determine which of the 3 different DMN usage models is best.
We are going to use a separate, full Camunda instance, but turn off everything that would slow DMN throughput (e.g. history). These DMN instances would be used for situations where we did not need to associate a decision with a specific request. For example, we plan to use it to provide configuration data to applications rather than using static, local properties files.
The advantage of this is obvious. For example, you build workflows on your local Windows PC that refer to files in a location different from the one found on a Linux workstation. Your code simply needs to determine its operating system, send that to the “standalone” DMN instance as an input, and it will return the file location appropriate to that operating system.
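The client side of that idea might look like the sketch below: detect the operating system and package it as an input for the decision call. It uses only the Java standard library. `OsInput`, the family names, and the input name `operatingSystem` are assumptions made for this example; the actual input name depends on how the decision table is defined, and the call to the DMN instance itself is not shown.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper that turns the local OS into a decision input. */
final class OsInput {

    /** Maps a raw os.name value to a coarse family suitable as a table input. */
    static String family(String osName) {
        String name = osName.toLowerCase(Locale.ROOT);
        // Check macOS first: "darwin" contains "win" and must not match Windows.
        if (name.contains("mac") || name.contains("darwin")) {
            return "mac";
        }
        if (name.startsWith("windows")) {
            return "windows";
        }
        if (name.contains("linux")) {
            return "linux";
        }
        return "other";
    }

    /** Builds the variables to send along with the decision request. */
    static Map<String, Object> decisionVariables() {
        String osName = System.getProperty("os.name", "unknown");
        return Collections.<String, Object>singletonMap("operatingSystem", family(osName));
    }
}
```

Normalizing the value on the client keeps the decision table small, since it only needs one rule per family rather than one per exact OS version string.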
The concern I have about performance is illustrated here. If you have an application or workflow that constantly requires such information, then DMN may become the throughput limiter. The benchmarks shown in Bernd’s blogs suggest performance at a level I can’t understand, nor could I even replicate as I have no idea how you generated over 600,000 decision requests per second without specialized hardware and software. But I accept that you believe DMN can be really fast, which is what I’m counting on.
DMN as implemented within a Camunda BPMN instance offers a great deal of functionality. We have chosen to use it on a much broader basis and hope it can do the job.
As always, I’m grateful for Camunda’s prompt, helpful responses. You folks are great!
Hey Michael - did you ever find the best compromise?
We shut down all of Camunda’s native history recording (i.e. history-level = none). This reduces the overhead that history recording imposes on high-volume transactions.
We found that you can just use the regular DMN engine included in the Camunda distribution. We did build a proof-of-concept standalone implementation of the DMN engine, but didn’t pursue it because of the challenges of maintaining DMN tables without the Camunda distribution tooling.
I was told there was no advantage to running a separate Camunda instance exclusively for DMN calls on the same server as the Camunda instance accepting process calls.
In short, just use what our friends at Camunda have built for us, as it seems to be the best solution when considering all aspects of its use and management. It is very fast out of the box; we hit it with a very high volume of requests and it seems to work fine.