Zeebe use cases and validation of our architectural approach

mlehmacher · March 31, 2020, 7:08pm

Hello everyone,

We are currently evaluating Zeebe for integration into our microservice platform. For our SaaS business model we are building a platform that consists of “primitive” business processes which can be wired together into client-specific business flows. We want to deliver individually composed, scalable products for our clients on our platform. According to the marketing blurb, Zeebe is an exact fit for our solution requirements.

However, we have some questions, and we hope the answers will increase our confidence:

1. Circumstances and basic solution idea: We have specific platform modules (i.e. microservices) which offer “primitive” business processes. Primitive in the sense that they have to be composed into higher-order processes in order to deliver value to platform clients. That’s where Zeebe comes into play. We want to introduce Zeebe to our orchestration layer in order to bring together (wire into a sequence) different primitive processes into a valuable whole. Right now those primitive processes are implemented using either Camunda or hand-crafted components. Going forward we want to take a uniform approach to implementing processes and are considering using Zeebe in both the primitive and the orchestration process layers. In the primitive process layer, Zeebe will degenerate to always delegating task execution of primitive processes back to the service owning the process definition. Basically we see the need for process composition in different layers. We also think the BPMN expressivity supported by Zeebe is good enough for us. Do you endorse this approach? Is our understanding of process composition valid, i.e. having an orchestrating process where each task possibly represents a whole process taking place in another service?
2. Regarding process versioning and in-place migrations: As far as I understand, Zeebe offers versioning of process definitions. Is there any systematic support for migrating a process instance from one version to another? In your experience, is that something which is needed at all? We may have long-running processes, and I imagine we don’t want to keep task execution code for multiple versions of process definitions around.
3. The docs are a little sparse on error semantics. What happens when task workers are not able to finish their task? From what I understand, after a specific timeout another task worker will receive the task. What if that one also fails? Is there any way to report failures back to Zeebe? Is that something we will have to model “in-band”? What would really help us understand the behavior is a docker-compose example application which illustrates certain scenarios. Do you know of such an example?
4. On https://zeebe.io/ you write “Zeebe workflows can react to messages published to Apache Kafka and more”. However, I cannot find anything specific about that in the docs. Are you referring to the Kafka Connector for Zeebe (camunda-community-hub/kafka-connect-zeebe on GitHub)? Does integrating Zeebe workflows with messaging solutions yield value which is not otherwise provided by vanilla Zeebe?

Thank you everyone for your answers in advance!

salaboy · April 2, 2020, 9:51am

@mlehmacher hi there… some answers to your questions:

1. Totally yes… that is a common scenario for tools like Zeebe or Camunda.
2. It really depends on your business requirements. Zeebe does version the flows, and you can decide to keep old versions around until their instances are finished and create new instances only with new versions.
3. Have you seen Operate? If a worker fails more often than the default number of retries allows, Zeebe creates an incident, which you can see in Operate and then retry after fixing the error. I am a Kubernetes guy, so I recommend checking out our Helm charts at http://helm.zeebe.io, where you can deploy a Zeebe cluster with Operate.
4. Kafka integration is just one example of what can be done. Many people doing microservices use Kafka messaging to communicate between systems, and Zeebe can tap into Kafka messages and move a process forward if needed. If you are not planning to use Kafka, don’t let that confuse you. If you are, you can treat it as an alternative to interacting with the workflows through a client API, since you can do the same with Kafka messages.

I hope this helps to clarify some concerns. We will be more than happy to help if you decide to use Zeebe for your project with other questions that might arise in your journey…

Cheers

jwulf · April 2, 2020, 5:45pm

There is no migration of process instances. See this issue.

You can version your task types to have workers that service different versions. Have a look at the flowing-retail project for an example.
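As a rough illustration of versioned task types (not taken from flowing-retail), here is a minimal Python sketch with no Zeebe client involved. The task-type names `charge-card-v1` and `charge-card-v2`, the handler functions, and `handle_job` are all hypothetical; the point is only that each process-definition version refers to its own task type, so old and new handlers can coexist until the old instances finish.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]


def charge_card_v1(variables: dict) -> dict:
    # Hypothetical handler for process definitions that use "charge-card-v1".
    return {"charged": variables["amount"]}


def charge_card_v2(variables: dict) -> dict:
    # Hypothetical handler for newer definitions; it also reports a currency.
    return {
        "charged": variables["amount"],
        "currency": variables.get("currency", "EUR"),
    }


# One entry per versioned task type (names are assumptions for illustration).
HANDLERS: Dict[str, Handler] = {
    "charge-card-v1": charge_card_v1,
    "charge-card-v2": charge_card_v2,
}


def handle_job(task_type: str, variables: dict) -> dict:
    """Dispatch a job to the handler registered for its task type."""
    try:
        handler = HANDLERS[task_type]
    except KeyError:
        raise LookupError(f"no worker registered for task type {task_type!r}")
    return handler(variables)
```

Once no running instance references `charge-card-v1` any more, its entry and handler can be removed.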

There are two types of failure: an exception (such as a database not being available, or an unhandled runtime exception in the worker code), which is a technical failure, and a business error.

In the case of an exception, the task worker can catch something exceptional and explicitly send the fail command to the broker with a failure message, optionally decrementing the retry count, or the worker library code may catch an unhandled exception and send a fail command (which in at least the case of the Node.js library, decrements the retry count).

In this case, if there are any retries left, the broker will immediately make the job available for another worker to request.

If there are no retries left, then the broker will raise an incident for this process instance.

If the task worker neither completes nor fails the job within the amount of time that it specified as the timeout when it activated the job, the broker will make the job available for activation by another worker, without decrementing the retry count.
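The rules described above can be condensed into a toy state model. This is a sketch in plain Python, not Zeebe code: the `Job` class, the state names, and the functions `activate`, `complete`, `fail`, and `time_out` are assumptions made for illustration, and `fail` takes the remaining retry count as an explicit argument to mirror the idea that the worker decides whether to decrement it.

```python
from dataclasses import dataclass


@dataclass
class Job:
    retries: int
    # One of: activatable | activated | completed | incident
    state: str = "activatable"


def activate(job: Job) -> None:
    # A worker requests the job; only activatable jobs can be handed out.
    if job.state != "activatable":
        raise ValueError(f"job is {job.state}, not activatable")
    job.state = "activated"


def complete(job: Job) -> None:
    job.state = "completed"


def fail(job: Job, remaining_retries: int) -> None:
    # The worker reports a failure and says how many retries remain.
    job.retries = remaining_retries
    # Retries left: immediately available again. None left: incident.
    job.state = "activatable" if remaining_retries > 0 else "incident"


def time_out(job: Job) -> None:
    # The worker neither completed nor failed in time:
    # the job becomes available again and the retry count is untouched.
    job.state = "activatable"
```

In this model, a job with two retries that first times out and then fails twice (each time with one retry fewer) still has both retries after the timeout and only ends up as an incident after the second explicit failure.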

In the case of a business error, the task worker can specify the error code and trigger a specific error flow in the process, but at the moment cannot update the job variables at the same time. See this issue.
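To make the distinction between the two failure types concrete, here is a small Python sketch of how a worker wrapper might decide which command to send. It does not use any Zeebe client library: `BusinessError`, `run_job`, and the returned dictionaries are hypothetical stand-ins for the complete, fail, and throw-error commands. The error command deliberately carries no variables, reflecting the limitation mentioned above.

```python
from typing import Callable


class BusinessError(Exception):
    """Hypothetical exception a handler raises for a modelled business error."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def run_job(handler: Callable[[dict], dict], variables: dict, retries: int) -> dict:
    """Run a handler and describe the command a worker would send afterwards."""
    try:
        result = handler(variables)
    except BusinessError as err:
        # Business error: trigger the error flow for this code, no variables.
        return {"command": "throw_error", "error_code": err.code}
    except Exception as exc:
        # Technical failure: fail the job and decrement the retry count.
        return {"command": "fail", "retries": retries - 1, "message": str(exc)}
    return {"command": "complete", "variables": result}
```

A handler that raises `BusinessError("CARD_DECLINED")` would lead to the error flow, while a handler that raises, say, a `ConnectionError` would lead to a fail command with one retry fewer.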