Is the Zeebe v0.15.0 release ready for production with a single-node broker for limited use cases?

vermaan · December 30, 2018, 9:39pm

I know this sounds like a bad idea, but for a limited use case, do you think a single-node Zeebe broker is stable enough to be used in production?

mike.winters · January 2, 2019, 9:33am

Hi @vermaan, I know @menski commented on this in the Slack channel already. I’ll post a response here too, because the forum is a bit more visible for folks who have the same question later.

• We won’t be putting a “production-ready” label on the 0.15 release, even for single-node deployments
• That being said, like Sebastian mentioned, you could still give it a try and let us know if you run into any problems, as it is true that much of the stability work we’ll be doing relates to running Zeebe in a cluster / with K8S.
• But running Zeebe in a cluster with multiple nodes is what will make it fault tolerant (we touch on this briefly here: https://docs.zeebe.io/basics/partitions.html), and we haven’t done any substantial testing of Zeebe on a single node with some sort of persistent storage, such as EBS in the case of an EC2 deployment, attached to the instance. This isn’t an “officially supported” scenario. So, in addition to scalability, your fault tolerance requirements are the important consideration here.

Hope that helps!

Best,
Mike

vermaan · January 2, 2019, 10:38am

Hi Mike,

I am slightly unclear re: partitioning in Zeebe. In Kafka, partitioning allows you to achieve a higher degree of parallelism on a topic.

In the context of Zeebe, is partitioning used to make your cluster more fault tolerant, or does it allow for running more workflows concurrently? My understanding is that it is the former.

Thanks,
Anuraag

vermaan · January 2, 2019, 11:00am

@wints - Also, I am curious to understand how Zeebe does leader election and guarantees consistency of writes to a partition between the leader and its replicas. Can you point me to a resource where this is explained?

mike.winters · January 2, 2019, 11:01am

Hi Anuraag, apologies. I was referring to the “replication” section of that page in the docs, but I didn’t make that clear (and replication / replication factor / fault tolerance in general is a subject that we should document in more detail).

It sounds like your understanding is correct: partitioning is what makes it possible to distribute workflow instances across a cluster and therefore run more workflows concurrently. The replicationFactor defines the number of replications of each partition, and it’s these replications (when running on a cluster with multiple nodes) that make Zeebe fault tolerant. This is best described in the docs here, albeit not in a ton of detail: https://docs.zeebe.io/basics/clustering.html#raft-consensus-and-replication-protocol. This, too, is conceptually similar to Kafka.

replicationFactor is set in the Zeebe config file: https://docs.zeebe.io/operations/the-zeebecfgtoml-file.html
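As a rough illustration of why replication across multiple nodes, and not a single node, is what provides fault tolerance: a Raft-style protocol like the one linked above needs a majority of a partition's replicas to agree before a write counts as committed. The sketch below is plain Python arithmetic, not Zeebe code, and the function names are made up for this example.

```python
# Illustrative only: majority-quorum arithmetic for a Raft-style protocol.
# raft_quorum and tolerated_failures are hypothetical helper names,
# not part of any Zeebe API.

def raft_quorum(replication_factor: int) -> int:
    """Smallest majority of replicas that must acknowledge a write."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be lost while a majority is still reachable."""
    return replication_factor - raft_quorum(replication_factor)


for rf in (1, 2, 3, 5):
    print(
        f"replicationFactor={rf}: quorum={raft_quorum(rf)}, "
        f"tolerated failures={tolerated_failures(rf)}"
    )
```

With a replication factor of 1, which is all a single-node broker can offer, no replica can be lost. A factor of 3 tolerates one lost replica and a factor of 5 tolerates two, assuming the replicas sit on different nodes.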

Let me know if that helps!

Best,
Mike

vermaan · January 2, 2019, 11:04am

Thanks. I will give it a read.

mike.winters · January 2, 2019, 11:04am

Hi Anuraag, your second question came in while I was still working on my response so I missed it. I think the “Clustering” page in general would be helpful for you: https://docs.zeebe.io/basics/clustering.html. Let me know if there are concepts you think are missing here. Always good to know where we can improve the docs.

Best,
Mike
