We are upgrading Zeebe from 8.2.12 to 8.4.0 (8.4.5 also failed), but the Zeebe brokers fail on startup with the error java.util.concurrent.CompletionException: io.atomix.cluster.messaging.MessagingException$RemoteHandlerFailure: Remote handler failed to handle message, cause: Failed to handle message, host dev-zeebe-0.dev-zeebe.default.svc:26502 is not a known cluster member
The Helm chart version for 8.4.0 was 9.0.2. Starting up 8.4.5 gave the same error. We also tried the latest version (8.5.0-alpha2) locally on my laptop with the Helm command below and saw the same issue in the logs for Zeebe broker 0:
helm install dev camunda/camunda-platform --set identity.enabled=false --set optimize.enabled=false --set tasklist.enabled=false --set operate.enabled=false --set connectors.enabled=false --set zeebe.affinity.podAntiAffinity=null --set zeebe-gateway.affinity.podAntiAffinity=null --set global.identity.auth.enabled=false
The installation fails both on our AWS setup with EC2 instances and locally on a laptop.
I am probably missing something in the configuration. Any insights into what could be going wrong?
The Helm values configuration file is:
global:
  identity:
    auth:
      enabled: false
  image:
    tag: 8.4.0
identity:
  enabled: false
optimize:
  enabled: false
tasklist:
  enabled: false
operate:
  enabled: false
elasticsearch:
  enabled: true
  image:
    repository: bitnami/elasticsearch
    tag: 8.3.2
  master:
    replicaCount: 1
    resources:
      requests:
        cpu: 1
        memory: 2Gi
      limits:
        cpu: 1
        memory: 2Gi
connectors:
  enabled: false
zeebe:
  clusterSize: 3
  partitionCount: 3
  replicationFactor: 1
  cpuThreadCount: 4
  ioThreadCount: 4
  logLevel: info
  retention:
    enabled: true
    minimumAge: 10d
  affinity:
    podAntiAffinity: null
  env:
    - name: ZEEBE_BROKER_EXECUTION_METRICS_EXPORTER_ENABLED
      value: "true"
  pvcSize: 128Gi
  resources:
    requests:
      cpu: 1
      memory: 512Mi
    limits:
      cpu: 1
      memory: 512Mi
zeebe-gateway:
  replicas: 2
  affinity:
    podAntiAffinity: null
  env:
    - name: ZEEBE_GATEWAY_THREADS_MANAGEMENTTHREADS
      value: "4"
    - name: ZEEBE_GATEWAY_MONITORING_ENABLED
      value: "true"
  resources:
    requests:
      cpu: 1
      memory: 512Mi
    limits:
      cpu: 1
      memory: 512Mi
The complete stack trace from broker 0 is:
2024-03-28 08:36:56.153 [] [atomix-cluster-heartbeat-sender] [] INFO
io.atomix.cluster.protocol.swim - 0 - Member added Member{id=2, address=dev-zeebe-2.dev-zeebe.default.svc:26502, properties={}}
2024-03-28 08:36:56.184 [Broker-0] [zb-actors-1] [] WARN
io.camunda.zeebe.topology.gossip.ClusterTopologyGossiper - Failed to sync with 2
java.util.concurrent.CompletionException: io.atomix.cluster.messaging.MessagingException$RemoteHandlerFailure: Remote handler failed to handle message, cause: Failed to handle message, host dev-zeebe-0.dev-zeebe.default.svc:26502 is not a known cluster member
at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source) ~[?:?]
at java.base/java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source) ~[?:?]
at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(Unknown Source) ~[?:?]
at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source) ~[?:?]
at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source) ~[?:?]
at io.atomix.cluster.messaging.impl.NettyMessagingService.lambda$executeOnPooledConnection$25(NettyMessagingService.java:626) ~[zeebe-atomix-cluster-8.4.0.jar:8.4.0]
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31) ~[guava-33.0.0-jre.jar:?]
at io.atomix.cluster.messaging.impl.NettyMessagingService.lambda$executeOnPooledConnection$26(NettyMessagingService.java:624) ~[zeebe-atomix-cluster-8.4.0.jar:8.4.0]
at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown Source) ~[?:?]
at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown Source) ~[?:?]
at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source) ~[?:?]
at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source) ~[?:?]
at io.atomix.cluster.messaging.impl.AbstractClientConnection.dispatch(AbstractClientConnection.java:49) ~[zeebe-atomix-cluster-8.4.0.jar:8.4.0]
at io.atomix.cluster.messaging.impl.AbstractClientConnection.dispatch(AbstractClientConnection.java:30) ~[zeebe-atomix-cluster-8.4.0.jar:8.4.0]
at io.atomix.cluster.messaging.impl.NettyMessagingService$MessageDispatcher.channelRead0(NettyMessagingService.java:1109) ~[zeebe-atomix-cluster-8.4.0.jar:8.4.0]
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) ~[netty-transport-4.1.104.Final.jar:4.1.104.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[netty-transport-4.1.104.Final.jar:4.1.104.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.104.Final.jar:4.1.104.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[netty-transport-4.1.104.Final.jar:4.1.104.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346) ~[netty-codec-4.1.104.Final.jar:4.1.104.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318) ~[netty-codec-4.1.104.Final.jar:4.1.104.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[netty-transport-4.1.104.Final.jar:4.1.104.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.104.Final.jar:4.1.104.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[netty-transport-4.1.104.Final.jar:4.1.104.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[netty-transport-4.1.104.Final.jar:4.1.104.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) ~[netty-transport-4.1.104.Final.jar:4.1.104.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[netty-transport-4.1.104.Final.jar:4.1.104.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[netty-transport-4.1.104.Final.jar:4.1.104.Final]
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:800) ~[netty-transport-classes-epoll-4.1.104.Final.jar:4.1.104.Final]
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:509) ~[netty-transport-classes-epoll-4.1.104.Final.jar:4.1.104.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:407) ~[netty-transport-classes-epoll-4.1.104.Final.jar:4.1.104.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[netty-common-4.1.104.Final.jar:4.1.104.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.104.Final.jar:4.1.104.Final]
at java.base/java.lang.Thread.run(Unknown Source) ~[?:?]
Caused by: io.atomix.cluster.messaging.MessagingException$RemoteHandlerFailure: Remote handler failed to handle message, cause: Failed to handle message, host dev-zeebe-0.dev-zeebe.default.svc:26502 is not a known cluster member
... 22 more
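In case it helps with triage, here is a minimal sketch of how a saved broker log could be scanned to see which hosts are being rejected and how often. It is not part of our setup: the function name `count_unknown_members` is hypothetical, and the pattern is taken only from the message text in the trace above, so the wording may differ in other Zeebe versions.

```python
import re
from collections import Counter

# Pattern copied from the error text in the stack trace above.
# Assumption: other Zeebe versions phrase the message the same way.
UNKNOWN_MEMBER = re.compile(r"host (\S+) is not a known cluster member")


def count_unknown_members(log_text: str) -> Counter:
    """Count how often each host:port is reported as an unknown cluster member."""
    return Counter(UNKNOWN_MEMBER.findall(log_text))
```

For example, after saving the output of `kubectl logs dev-zeebe-0` to a file, passing that file's contents to `count_unknown_members` and printing `most_common()` shows whether the error always names the same broker address or several of them. Note that the same message can appear twice per exception (once in the top line and once under "Caused by"), so the counts are relative rather than exact.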