Hi, we need to create workflow instances that represent nodes in a tree-like structure, so that a parent node waits for its child workflows to complete before doing some work (for example, aggregating numbers up the hierarchy).
I tested with a simple workflow like the one below (with no external tasks, to see pure Zeebe performance) on a reasonable Zeebe cluster (4 nodes, 12 partitions), and the performance does not look good. I could be doing something wrong, hence checking.
The tree has 15 levels, and each node has 2 children.
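For a sense of scale, the number of workflow instances such a tree produces can be estimated with a few lines of Python. This is only a sketch: it assumes the root sits at level 0 with 15 levels of children beneath it (which matches the 65535 instances reported later in this thread), and `total_instances` is a hypothetical helper name, not part of any Zeebe API.

```python
def total_instances(depth: int, children_per_node: int = 2) -> int:
    """Count the nodes of a full tree whose root is level 0 and whose leaves are at `depth`."""
    # Level k holds children_per_node ** k nodes; sum over levels 0..depth.
    return sum(children_per_node ** level for level in range(depth + 1))


if __name__ == "__main__":
    # Assumption: 15 levels below the root, 2 children per node.
    print(total_instances(15))  # 65535
```

Every one of these instances is a child (directly or transitively) of the single root instance.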
The above workflow, which does not have any external jobs/tasks, took nearly 1h to complete (and the cluster did not have any other activity/workflows going on). Here is a screenshot from the Grafana dashboard.
We also observed that all the workflow instances are handled by the same partition. Is that because of the parent-child relation between these workflows? If so, it is concerning. Is there a way to avoid this?
Correct. The parent-child instances are all on the same partition to process them efficiently (i.e. without communicating to another partition that can be on a different node).
You could distribute the instances to other partitions by using message events or creating multiple process instances.
However, it is not expected that the processing of parent-child processes has a big overhead. Please share your examples to make it easy to reproduce the behavior.
If sharing them directly in the forum doesn't work, then I recommend sharing a GitHub repository/gist that contains all required resources (e.g. processes, commands to send, etc.).
Start a new workflow instance with the payload {"input": "1"} as shown below (this does not require any job workers, as there are no jobs in the workflow):
I ran the experiment on Camunda Cloud (Zeebe 1.0.0-rc1, Cluster size: 3, Partition size: 2, Replication size: 3). It took 13 minutes to complete all 65535 process instances.
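To put the two runs side by side, here is a rough throughput calculation. It is only a sketch: it assumes the original run also created 65535 instances and treats "nearly 1h" as a full 60 minutes, and the two runs used different clusters and Zeebe versions, so the figures are not directly comparable. `instances_per_second` is a hypothetical helper name.

```python
def instances_per_second(instances: int, minutes: float) -> float:
    """Average number of process instances completed per second."""
    return instances / (minutes * 60)


if __name__ == "__main__":
    total = 65535  # instance count reported for the Camunda Cloud run

    # Camunda Cloud run: 13 minutes.
    print(round(instances_per_second(total, 13), 1))  # 84.0

    # Original run: assumed 60 minutes ("nearly 1h") and the same instance count.
    print(round(instances_per_second(total, 60), 1))  # 18.2
```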
The numbers sound reasonable to me. I don’t see any bottlenecks. In order to tweak the performance, the execution could be distributed over more partitions (e.g. by using message events).
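As a rough illustration of why message events help: Zeebe routes a published message to a partition based on a hash of its correlation key, so instances started or continued via messages with different keys can land on different partitions. The simulation below only sketches that idea. CRC32 is a stand-in and not Zeebe's actual hash function, the key format `node-<i>` is made up, and the partition count of 12 is taken from the original post.

```python
import zlib
from collections import Counter

PARTITION_COUNT = 12  # partition count from the original post


def partition_for(correlation_key: str, partition_count: int = PARTITION_COUNT) -> int:
    """Map a correlation key to a partition id in 1..partition_count.

    Stand-in hash: CRC32 is used for illustration only; it is NOT Zeebe's hash.
    """
    return zlib.crc32(correlation_key.encode("utf-8")) % partition_count + 1


def distribution(keys) -> Counter:
    """Count how many keys land on each partition."""
    return Counter(partition_for(key) for key in keys)


if __name__ == "__main__":
    # Hypothetical correlation keys, one per tree node.
    spread = distribution(f"node-{i}" for i in range(65535))
    for partition_id in sorted(spread):
        print(partition_id, spread[partition_id])
```

With parent-child calls, all 65535 instances stay on the partition of the root instance. With one distinct correlation key per node, the same workload would be split across the partitions, at the cost of modelling the parent-child hand-off with message events.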