Zeebe multi-region active-passive setup

oleg.efimov · April 24, 2024, 7:22pm

The documented multi-region setup runs Zeebe in active-active mode within an overall active-passive setup.

This creates some issues for our overall active-passive setup:

1. Faults in the passive region affect the active region: for example, network hiccups and node crashes can delay or abort consensus, and with it all request processing in that partition.
2. Cross-region network calls add to request processing latency. This becomes visible at higher percentiles and larger cluster sizes, because the chance that at least one node responds slowly grows with the number of nodes involved.
3. Chatty cross-region consensus traffic is more expensive. Ideally, we would pay only for minimal one-way replication traffic from the active to the passive region.
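
To put rough numbers on the latency point, here is a minimal sketch of the arithmetic. It assumes each node is slow independently with the same probability and that a request has to wait for every one of the responses. Both are simplifying assumptions of mine, and the function name is hypothetical. A quorum-based commit only waits for a majority, so the real figure for a single partition would be lower.

```python
def p_any_slow(p_slow: float, responses: int) -> float:
    """Probability that at least one of `responses` independent replies is slow.

    Assumes every reply is slow with the same probability `p_slow`
    and that slowdowns are independent of each other.
    """
    return 1.0 - (1.0 - p_slow) ** responses


if __name__ == "__main__":
    # A "slow" reply here means one above the single-node p99, so p_slow = 0.01.
    for n in (1, 3, 10, 100):
        print(f"{n:>3} responses: {p_any_slow(0.01, n):.3f}")
```

Under these assumptions, a 1% chance of a slow reply per node turns into roughly a 63% chance of at least one slow reply once a request depends on 100 of them.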

Is there an existing solution to these problems, especially to point 1?
To make it more concrete: have you considered a truly active-passive mode for Zeebe, e.g. one where a special stream processor replicates changes to the passive region? It could even be configurable as async (non-zero RPO, low region coupling) or sync (zero RPO, tight region coupling).
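
To illustrate the trade-off I mean, here is a toy sketch of the two modes. Everything in it is hypothetical (the `Replicator` class, its methods, the in-memory lists standing in for the two regions); it is not Zeebe code or an existing Zeebe API. It only shows where the non-zero RPO comes from.

```python
from collections import deque


class Replicator:
    """Toy model of one-way replication from an active to a passive region."""

    def __init__(self, sync: bool):
        self.sync = sync
        self.active_log = []    # records committed in the active region
        self.passive_log = []   # records that reached the passive region
        self.pending = deque()  # async mode only: committed but not yet shipped

    def append(self, record):
        """Commit a record in the active region."""
        self.active_log.append(record)
        if self.sync:
            # Stands in for a blocking cross-region round trip before the ack.
            self.passive_log.append(record)
        else:
            # Ack immediately; the record is shipped later by ship().
            self.pending.append(record)

    def ship(self):
        """Async mode: drain the backlog to the passive region."""
        while self.pending:
            self.passive_log.append(self.pending.popleft())

    def unreplicated(self) -> int:
        """Records that would be lost if the active region failed right now."""
        return len(self.active_log) - len(self.passive_log)


if __name__ == "__main__":
    async_repl = Replicator(sync=False)
    for i in range(3):
        async_repl.append(i)
    print(async_repl.unreplicated())  # backlog before shipping
    async_repl.ship()
    print(async_repl.unreplicated())  # backlog after shipping
```

In sync mode the backlog is always empty but every commit pays for the cross-region round trip; in async mode commits stay local and the backlog is what you stand to lose on failover.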

Thanks!

Also posted in Stack Overflow.
