EDIT: I think this has something to do with replication=2. It seems to have no benefit over replication=1, because the leader loses its quorum as soon as one of the two replicas goes down.
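My reasoning, assuming I understand Raft quorums correctly:

quorum = floor(replicationFactor / 2) + 1
       = floor(2 / 2) + 1
       = 2

So with replicationFactor=2 both replicas of a partition are needed, and as soon as one of them is gone the partition can neither commit new entries nor elect a new leader. Effectively that is the same availability as replicationFactor=1 with its single replica down.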
I wanted to showcase the Zeebe architecture in a university lecture, so I started a local Zeebe cluster based on the (deprecated) docker-compose file: 1 gateway and 4 broker nodes. Everything starts just fine, partitions are created and replicated, etc.
Now I pause one of the nodes that leads a partition. My expectation was that the follower, unable to contact the leader, would trigger a leader election after a certain amount of time. But nothing happens: the leader stays offline and the follower just keeps logging that it cannot reach it. I use the latest Zeebe image (8.5.0), but the same happens with previous versions.
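For reference, this is roughly how I take the broker down, using the container names from the compose file at the end of this post (broker 1 runs in the container zeebe_broker_2):

docker pause zeebe_broker_2   # freeze the container
# or, to take it down completely:
docker kill zeebe_broker_2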
Additionally, broker 0, which was the leader of partition 1, steps down to FOLLOWER, so I am not even able to deploy a process anymore.
What am I getting wrong? Any thoughts on that?
Thanks
Stefan
Broker 0 log: role change of partition 1 to FOLLOWER
2024-05-13 07:56:30.418 [] [atomix-cluster-events] [] DEBUG
2024-05-13T07:56:30.418821096Z io.camunda.zeebe.broker.client.impl.BrokerTopologyManagerImpl - Received metadata change from Broker 0, partitions {1=LEADER}, terms {1=2} and health {1=HEALTHY}.
2024-05-13T07:56:48.489679048Z 2024-05-13 07:56:48.488 [] [atomix-cluster-heartbeat-sender] [] INFO
2024-05-13T07:56:48.489727882Z io.atomix.cluster.protocol.swim - 0 - Member unreachable Member{id=1, address=172.20.0.4:26502, properties={brokerInfo=EADJAAAABAABAAAAAwAAAAQAAAACAAAAAAABCgAAAGNvbW1hbmRBcGkQAAAAMTcyLjIwLjAuNDoyNjUwMQUAAgEAAAABAgAAAAAMAAECAAAAAQAAAAAAAAAFAAAAOC41LjAFAAIBAAAAAAIAAAAA, event-service-topics-subscribed=KIIDAGpvYnNBdmFpbGFibOU=}}
2024-05-13T07:56:48.490034173Z 2024-05-13 07:56:48.489 [] [atomix-cluster-events] [] DEBUG
2024-05-13T07:56:48.490049923Z io.camunda.zeebe.broker.client.impl.BrokerTopologyManagerImpl - Received REACHABILITY_CHANGED for broker 1, do nothing.
2024-05-13T07:56:48.490390132Z 2024-05-13 07:56:48.490 [Broker-0] [zb-actors-1] [TopologyManager] DEBUG
2024-05-13T07:56:48.490402507Z io.camunda.zeebe.broker.clustering - Received REACHABILITY_CHANGED from member 1, was not handled.
2024-05-13T07:56:49.464627591Z 2024-05-13 07:56:49.462 [] [atomix-cluster-heartbeat-sender] [] WARN
2024-05-13T07:56:49.464748924Z io.atomix.cluster.protocol.swim.probe - 0 - Failed to probe 1
2024-05-13T07:56:49.464758216Z java.util.concurrent.TimeoutException: Request atomix-membership-probe to 172.20.0.4:26502 timed out in PT0.1S
2024-05-13T07:56:49.464760674Z at io.atomix.cluster.messaging.impl.NettyMessagingService.lambda$sendAndReceive$4(NettyMessagingService.java:261) ~[zeebe-atomix-cluster-8.5.0.jar:8.5.0]
2024-05-13T07:56:49.464766799Z at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
2024-05-13T07:56:49.464768757Z at java.base/java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
2024-05-13T07:56:49.464771216Z at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) ~[?:?]
2024-05-13T07:56:49.464773299Z at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
2024-05-13T07:56:49.464775882Z at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
2024-05-13T07:56:49.464777757Z at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.108.Final.jar:4.1.108.Final]
2024-05-13T07:56:49.464779632Z at java.base/java.lang.Thread.run(Unknown Source) [?:?]
2024-05-13T07:56:49.572874757Z 2024-05-13 07:56:49.572 [] [atomix-cluster-heartbeat-sender] [] INFO
2024-05-13T07:56:49.572923757Z io.atomix.cluster.protocol.swim.probe - 0 - Failed all probes of Member{id=1, address=172.20.0.4:26502, properties={brokerInfo=EADJAAAABAABAAAAAwAAAAQAAAACAAAAAAABCgAAAGNvbW1hbmRBcGkQAAAAMTcyLjIwLjAuNDoyNjUwMQUAAgEAAAABAgAAAAAMAAECAAAAAQAAAAAAAAAFAAAAOC41LjAFAAIBAAAAAAIAAAAA, event-service-topics-subscribed=KIIDAGpvYnNBdmFpbGFibOU=}}. Marking as suspect.
2024-05-13T07:56:49.737642632Z 2024-05-13 07:56:49.736 [] [atomix-cluster-heartbeat-sender] [] WARN
2024-05-13T07:56:49.737695674Z io.atomix.cluster.protocol.swim.sync - 0 - Failed to synchronize membership with Member{id=1, address=172.20.0.4:26502, properties={brokerInfo=EADJAAAABAABAAAAAwAAAAQAAAACAAAAAAABCgAAAGNvbW1hbmRBcGkQAAAAMTcyLjIwLjAuNDoyNjUwMQUAAgEAAAABAgAAAAAMAAECAAAAAQAAAAAAAAAFAAAAOC41LjAFAAIBAAAAAAIAAAAA, event-service-topics-subscribed=KIIDAGpvYnNBdmFpbGFibOU=}, version=8.5.0, timestamp=1715586489310, state=SUSPECT, incarnationNumber=1715586489326}
2024-05-13T07:56:49.737700549Z java.util.concurrent.TimeoutException: Request atomix-membership-sync to 172.20.0.4:26502 timed out in PT0.1S
2024-05-13T07:56:49.737702674Z at io.atomix.cluster.messaging.impl.NettyMessagingService.lambda$sendAndReceive$4(NettyMessagingService.java:261) ~[zeebe-atomix-cluster-8.5.0.jar:8.5.0]
2024-05-13T07:56:49.737705049Z at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
2024-05-13T07:56:49.737706632Z at java.base/java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
2024-05-13T07:56:49.737708091Z at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) ~[?:?]
2024-05-13T07:56:49.737709882Z at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
2024-05-13T07:56:49.737790549Z at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
2024-05-13T07:56:49.737793674Z at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.108.Final.jar:4.1.108.Final]
2024-05-13T07:56:49.737795299Z at java.base/java.lang.Thread.run(Unknown Source) [?:?]
2024-05-13T07:56:50.437103383Z 2024-05-13 07:56:50.435 [Broker-0] [raft-server-0-1] [raft-server-1] WARN
2024-05-13T07:56:50.437156591Z io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - VersionedAppendRequest{version=2, term=2, leader=0, prevLogIndex=2, prevLogTerm=2, entries=0, commitIndex=2} to 1 failed
2024-05-13T07:56:50.437160299Z java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException: Request raft-partition-partition-1-append-versioned to 172.20.0.4:26502 timed out in PT2.5S
2024-05-13T07:56:50.437163174Z at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source) ~[?:?]
2024-05-13T07:56:50.437168466Z at java.base/java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source) ~[?:?]
2024-05-13T07:56:50.437170133Z at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(Unknown Source) ~[?:?]
2024-05-13T07:56:50.437171591Z at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source) ~[?:?]
2024-05-13T07:56:50.437172924Z at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source) ~[?:?]
2024-05-13T07:56:50.437174299Z at io.atomix.cluster.messaging.impl.NettyMessagingService.lambda$sendAndReceive$4(NettyMessagingService.java:259) ~[zeebe-atomix-cluster-8.5.0.jar:8.5.0]
2024-05-13T07:56:50.437175799Z at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
2024-05-13T07:56:50.437177216Z at java.base/java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
2024-05-13T07:56:50.437178549Z at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) ~[?:?]
2024-05-13T07:56:50.437180008Z at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
2024-05-13T07:56:50.437181341Z at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
2024-05-13T07:56:50.437182716Z at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.108.Final.jar:4.1.108.Final]
2024-05-13T07:56:50.437184424Z at java.base/java.lang.Thread.run(Unknown Source) [?:?]
2024-05-13T07:56:50.437185758Z Caused by: java.util.concurrent.TimeoutException: Request raft-partition-partition-1-append-versioned to 172.20.0.4:26502 timed out in PT2.5S
2024-05-13T07:56:50.437187758Z at io.atomix.cluster.messaging.impl.NettyMessagingService.lambda$sendAndReceive$4(NettyMessagingService.java:261) ~[zeebe-atomix-cluster-8.5.0.jar:8.5.0]
2024-05-13T07:56:50.437191383Z ... 7 more
2024-05-13T07:56:51.090425591Z 2024-05-13 07:56:51.089 [] [atomix-cluster-heartbeat-sender] [] INFO
2024-05-13T07:56:51.090467966Z io.atomix.cluster.protocol.swim.probe - 0 - Failed to probe 1
2024-05-13T07:56:53.037316551Z 2024-05-13 07:56:53.036 [Broker-0] [raft-server-0-1] [raft-server-1] WARN
2024-05-13T07:56:53.038672301Z io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - VersionedAppendRequest{version=2, term=2, leader=0, prevLogIndex=2, prevLogTerm=2, entries=0, commitIndex=2} to 1 failed
2024-05-13T07:56:53.038687051Z java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException: Request raft-partition-partition-1-append-versioned to 172.20.0.4:26502 timed out in PT2.5S
2024-05-13T07:56:53.038689134Z at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source) ~[?:?]
2024-05-13T07:56:53.038691842Z at java.base/java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source) ~[?:?]
2024-05-13T07:56:53.038693509Z at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(Unknown Source) ~[?:?]
2024-05-13T07:56:53.038694842Z at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source) ~[?:?]
2024-05-13T07:56:53.038696134Z at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source) ~[?:?]
2024-05-13T07:56:53.038697426Z at io.atomix.cluster.messaging.impl.NettyMessagingService.lambda$sendAndReceive$4(NettyMessagingService.java:259) ~[zeebe-atomix-cluster-8.5.0.jar:8.5.0]
2024-05-13T07:56:53.038698842Z at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
2024-05-13T07:56:53.038700051Z at java.base/java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
2024-05-13T07:56:53.038701301Z at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) ~[?:?]
2024-05-13T07:56:53.038702676Z at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
2024-05-13T07:56:53.038703926Z at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
2024-05-13T07:56:53.038705176Z at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.108.Final.jar:4.1.108.Final]
2024-05-13T07:56:53.038706551Z at java.base/java.lang.Thread.run(Unknown Source) [?:?]
2024-05-13T07:56:53.038707759Z Caused by: java.util.concurrent.TimeoutException: Request raft-partition-partition-1-append-versioned to 172.20.0.4:26502 timed out in PT2.5S
2024-05-13T07:56:53.038709134Z at io.atomix.cluster.messaging.impl.NettyMessagingService.lambda$sendAndReceive$4(NettyMessagingService.java:261) ~[zeebe-atomix-cluster-8.5.0.jar:8.5.0]
2024-05-13T07:56:53.038710509Z ... 7 more
2024-05-13T07:56:55.639714885Z 2024-05-13 07:56:55.637 [Broker-0] [raft-server-0-1] [raft-server-1] WARN
2024-05-13T07:56:55.639774510Z io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - VersionedAppendRequest{version=2, term=2, leader=0, prevLogIndex=2, prevLogTerm=2, entries=0, commitIndex=2} to 1 failed
2024-05-13T07:56:55.639785510Z java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException: Request raft-partition-partition-1-append-versioned to 172.20.0.4:26502 timed out in PT2.5S
2024-05-13T07:56:55.639860552Z at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source) ~[?:?]
2024-05-13T07:56:55.639871969Z at java.base/java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source) ~[?:?]
2024-05-13T07:56:55.639876802Z at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(Unknown Source) ~[?:?]
2024-05-13T07:56:55.639890635Z at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source) ~[?:?]
2024-05-13T07:56:55.639895594Z at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source) ~[?:?]
2024-05-13T07:56:55.639911260Z at io.atomix.cluster.messaging.impl.NettyMessagingService.lambda$sendAndReceive$4(NettyMessagingService.java:259) ~[zeebe-atomix-cluster-8.5.0.jar:8.5.0]
2024-05-13T07:56:55.639918969Z at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
2024-05-13T07:56:55.639923427Z at java.base/java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
2024-05-13T07:56:55.639927844Z at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) ~[?:?]
2024-05-13T07:56:55.639932427Z at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
2024-05-13T07:56:55.639936927Z at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
2024-05-13T07:56:55.639941302Z at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.108.Final.jar:4.1.108.Final]
2024-05-13T07:56:55.639946135Z at java.base/java.lang.Thread.run(Unknown Source) [?:?]
2024-05-13T07:56:55.639950635Z Caused by: java.util.concurrent.TimeoutException: Request raft-partition-partition-1-append-versioned to 172.20.0.4:26502 timed out in PT2.5S
2024-05-13T07:56:55.639955760Z at io.atomix.cluster.messaging.impl.NettyMessagingService.lambda$sendAndReceive$4(NettyMessagingService.java:261) ~[zeebe-atomix-cluster-8.5.0.jar:8.5.0]
2024-05-13T07:56:55.639960719Z ... 7 more
2024-05-13T07:56:55.641045469Z 2024-05-13 07:56:55.639 [Broker-0] [raft-server-0-1] [raft-server-1] WARN
2024-05-13T07:56:55.641072135Z io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - Suspected network partition after 3 failures from 1 over a period of time 7905 > 4000, stepping down
2024-05-13T07:56:55.641078052Z 2024-05-13 07:56:55.639 [Broker-0] [raft-server-0-1] [raft-server-1] INFO
2024-05-13T07:56:55.641081927Z io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-1} - Transitioning to FOLLOWER
2024-05-13T07:56:55.642560927Z 2024-05-13 07:56:55.642 [Broker-0] [zb-actors-0] [ZeebePartition-1] INFO
2024-05-13T07:56:55.642590635Z io.camunda.zeebe.broker.system - Transition to FOLLOWER on term 2 requested.
2024-05-13T07:56:55.642850260Z 2024-05-13 07:56:55.642 [Broker-0] [zb-actors-0] [ZeebePartition-1] DEBUG
2024-05-13T07:56:55.642860510Z io.camunda.zeebe.broker.system - Partition role transitioning from LEADER to FOLLOWER in term 2
2024-05-13T07:56:55.642907802Z 2024-05-13 07:56:55.642 [Broker-0] [zb-actors-0] [ZeebePartition-1] INFO
2024-05-13T07:56:55.642912969Z io.camunda.zeebe.broker.system - Prepare transition from LEADER[term: 2] -> FOLLOWER[term: 2]
2024-05-13T07:56:55.643120594Z 2024-05-13 07:56:55.642 [Broker-0] [zb-actors-0] [ZeebePartition-1] INFO
2024-05-13T07:56:55.643133635Z io.camunda.zeebe.broker.system - Prepare transition from LEADER[term: 2] -> FOLLOWER[term: 2] - preparing Admin API
2024-05-13T07:56:55.643352260Z 2024-05-13 07:56:55.643 [Broker-0] [zb-actors-0] [ZeebePartition-1] INFO
2024-05-13T07:56:55.643366469Z io.camunda.zeebe.broker.system - Prepare transition from LEADER[term: 2] -> FOLLOWER[term: 2] - preparing BackupApiRequestHandler
2024-05-13T07:56:55.643860677Z 2024-05-13 07:56:55.643 [Broker-0] [zb-actors-0] [StreamProcessor-1] DEBUG
2024-05-13T07:56:55.643876177Z io.camunda.zeebe.logstreams - Paused processing for partition 1
2024-05-13T07:56:55.643880344Z 2024-05-13 07:56:55.643 [Broker-0] [zb-actors-0] [ZeebePartition-1] INFO
...
2024-05-13T07:56:55.721936427Z 2024-05-13 07:56:55.721 [Broker-0] [zb-actors-0] [ZeebePartition-1] INFO
2024-05-13T07:56:55.721941010Z io.camunda.zeebe.broker.system - Transition to FOLLOWER on term 2 - transitioning Migration
2024-05-13T07:56:55.722212719Z 2024-05-13 07:56:55.722 [Broker-0] [zb-actors-0] [ZeebePartition-1] INFO
2024-05-13T07:56:55.722219427Z org.camunda.feel.FeelEngine - Engine created. [value-mapper: CompositeValueMapper(List(io.camunda.zeebe.feel.impl.MessagePackValueMapper@2e6687f0)), function-provider: io.camunda.zeebe.feel.impl.FeelFunctionProvider@60c8725a, clock: io.camunda.zeebe.engine.processing.bpmn.clock.ZeebeFeelEngineClock@1fade08d, configuration: {externalFunctionsEnabled: false}]
2024-05-13T07:56:55.723461677Z 2024-05-13 07:56:55.723 [Broker-0] [zb-actors-0] [ZeebePartition-1] INFO
2024-05-13T07:56:55.723471427Z org.camunda.dmn.DmnEngine - DMN-Engine created. [value-mapper: CompositeValueMapper(List(io.camunda.zeebe.feel.impl.MessagePackValueMapper@a3e220)), function-provider: org.camunda.feel.context.FunctionProvider$EmptyFunctionProvider$@516474cb, audit-loggers: List(), configuration: Configuration(false,false,false)]
2024-05-13T07:56:55.723530760Z 2024-05-13 07:56:55.723 [Broker-0] [zb-actors-0] [ZeebePartition-1] INFO
...
2024-05-13T07:56:55.743220635Z 2024-05-13 07:56:55.743 [Broker-0] [zb-actors-0] [ZeebePartition-1] INFO
2024-05-13T07:56:55.743222552Z io.camunda.zeebe.broker.system - Transition to FOLLOWER on term 2 - transitioning Admin API
2024-05-13T07:56:55.743223344Z 2024-05-13 07:56:55.743 [Broker-0] [zb-actors-0] [ZeebePartition-1] INFO
2024-05-13T07:56:55.743224094Z io.camunda.zeebe.broker.system - Transition to FOLLOWER on term 2 completed
2024-05-13T07:56:55.743257510Z 2024-05-13 07:56:55.743 [Broker-0] [zb-fs-workers-1] [Exporter-1] DEBUG
2024-05-13T07:56:55.743260010Z io.camunda.zeebe.broker.exporter - Recovered exporter 'Exporter-1' from snapshot at lastExportedPosition -1
2024-05-13T07:56:55.743494635Z 2024-05-13 07:56:55.743 [Broker-0] [zb-fs-workers-1] [Exporter-1] DEBUG
2024-05-13T07:56:55.743498469Z io.camunda.zeebe.broker.exporter - Set event filter for exporters: io.camunda.zeebe.stream.api.EventFilter$$Lambda/0x000000a001a43b88@52ca3f98
2024-05-13T07:56:55.743499344Z 2024-05-13 07:56:55.743 [Broker-0] [zb-fs-workers-1] [Exporter-1] DEBUG
2024-05-13T07:56:55.743500219Z io.camunda.zeebe.broker.exporter - Closed exporter director 'Exporter-1'.
zbctl status
ssh@macbook ~ % zbctl --address 127.0.0.1:26500 --insecure status
# all brokers online
Cluster size: 4
Partitions count: 3
Replication factor: 2
Gateway version: 8.5.0
Brokers:
  Broker 0 - 172.18.0.2:26501
    Version: 8.5.0
    Partition 1 : Leader, Healthy
  Broker 1 - 172.18.0.4:26501
    Version: 8.5.0
    Partition 1 : Follower, Healthy
    Partition 2 : Leader, Healthy
  Broker 2 - 172.18.0.5:26501
    Version: 8.5.0
    Partition 2 : Follower, Healthy
    Partition 3 : Leader, Healthy
  Broker 3 - 172.18.0.6:26501
    Version: 8.5.0
    Partition 3 : Follower, Healthy

# killed Broker 1
ssh@macbook ~ % zbctl --address 127.0.0.1:26500 --insecure status
Cluster size: 4
Partitions count: 3
Replication factor: 2
Gateway version: 8.5.0
Brokers:
  Broker 0 - 172.18.0.2:26501
    Version: 8.5.0
    Partition 1 : Follower, Healthy   # what? why?
  Broker 2 - 172.18.0.5:26501
    Version: 8.5.0
    Partition 2 : Follower, Healthy
    Partition 3 : Leader, Healthy
  Broker 3 - 172.18.0.6:26501
    Version: 8.5.0
    Partition 3 : Follower, Healthy
docker-compose
version: "2"
networks:
zeebe_network:
driver: bridge
services:
gateway:
restart: always
container_name: gateway
image: camunda/zeebe:${CAMUNDA_PLATFORM_VERSION}
environment:
- ZEEBE_LOG_LEVEL=debug
- ZEEBE_STANDALONE_GATEWAY=true
- ZEEBE_GATEWAY_NETWORK_HOST=0.0.0.0
- ZEEBE_GATEWAY_NETWORK_PORT=26500
- ZEEBE_GATEWAY_CLUSTER_INITIALCONTACTPOINTS=node0:26502,node1:26502,node2:26502,node3:26502
- ZEEBE_GATEWAY_CLUSTER_PORT=26502
- ZEEBE_GATEWAY_CLUSTER_HOST=gateway
- ZEEBE_GATEWAY_CLUSTER_BROADCASTUPDATES=true
ports:
- "26500:26500"
networks:
- zeebe_network
node0:
container_name: zeebe_broker_1
image: camunda/zeebe:${CAMUNDA_PLATFORM_VERSION}
environment:
- ZEEBE_LOG_LEVEL=debug
- ZEEBE_BROKER_CLUSTER_NODEID=0
- ZEEBE_BROKER_CLUSTER_PARTITIONSCOUNT=3
- ZEEBE_BROKER_CLUSTER_REPLICATIONFACTOR=2
- ZEEBE_BROKER_CLUSTER_CLUSTERSIZE=4
- ZEEBE_BROKER_CLUSTER_INITIALCONTACTPOINTS=node0:26502,node1:26502,node2:26502,node3:26502
- ZEEBE_BROKER_CLUSTER_ELECTIONTIMEOUT=2000ms
- ZEEBE_BROKER_CLUSTER_HEARTBEATINTERVAL=200ms
networks:
- zeebe_network
node1:
container_name: zeebe_broker_2
image: camunda/zeebe:${CAMUNDA_PLATFORM_VERSION}
environment:
- ZEEBE_LOG_LEVEL=debug
- ZEEBE_BROKER_CLUSTER_NODEID=1
- ZEEBE_BROKER_CLUSTER_PARTITIONSCOUNT=3
- ZEEBE_BROKER_CLUSTER_REPLICATIONFACTOR=2
- ZEEBE_BROKER_CLUSTER_CLUSTERSIZE=4
- ZEEBE_BROKER_CLUSTER_INITIALCONTACTPOINTS=node0:26502,node1:26502,node2:26502,node3:26502
- ZEEBE_BROKER_CLUSTER_ELECTIONTIMEOUT=2000ms
- ZEEBE_BROKER_CLUSTER_HEARTBEATINTERVAL=200ms
networks:
- zeebe_network
depends_on:
- node0
node2:
container_name: zeebe_broker_3
image: camunda/zeebe:${CAMUNDA_PLATFORM_VERSION}
environment:
- ZEEBE_LOG_LEVEL=debug
- ZEEBE_BROKER_CLUSTER_NODEID=2
- ZEEBE_BROKER_CLUSTER_PARTITIONSCOUNT=3
- ZEEBE_BROKER_CLUSTER_REPLICATIONFACTOR=2
- ZEEBE_BROKER_CLUSTER_CLUSTERSIZE=4
- ZEEBE_BROKER_CLUSTER_INITIALCONTACTPOINTS=node0:26502,node1:26502,node2:26502,node3:26502
- ZEEBE_BROKER_CLUSTER_ELECTIONTIMEOUT=2000ms
- ZEEBE_BROKER_CLUSTER_HEARTBEATINTERVAL=200ms
networks:
- zeebe_network
depends_on:
- node1
node3:
container_name: zeebe_broker_4
image: camunda/zeebe:${CAMUNDA_PLATFORM_VERSION}
environment:
- ZEEBE_LOG_LEVEL=debug
- ZEEBE_BROKER_CLUSTER_NODEID=3
- ZEEBE_BROKER_CLUSTER_PARTITIONSCOUNT=3
- ZEEBE_BROKER_CLUSTER_REPLICATIONFACTOR=2
- ZEEBE_BROKER_CLUSTER_CLUSTERSIZE=4
- ZEEBE_BROKER_CLUSTER_INITIALCONTACTPOINTS=node0:26502,node1:26502,node2:26502,node3:26502
- ZEEBE_BROKER_CLUSTER_ELECTIONTIMEOUT=2000ms
- ZEEBE_BROKER_CLUSTER_HEARTBEATINTERVAL=200ms
networks:
- zeebe_network
depends_on:
- node2
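(If the replication factor really is the culprit, I assume I would have to raise it to 3 on every broker, e.g.

      - ZEEBE_BROKER_CLUSTER_REPLICATIONFACTOR=3

so that each partition keeps a quorum of 2 when a single broker goes down. I have not verified that yet, though.)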