Hi all, we recently adopted the Camunda 8.2 Self-Managed solution on Kubernetes, using the Helm chart provided in the Camunda documentation, and have been facing a RESOURCE_EXHAUSTED error:
2023-12-21T08:28:55.441294942Z stderr F 08:28:55.440 | zeebe | [io.camunda.zeebe:userTask] ERROR: Grpc Stream Error: 8 RESOURCE_EXHAUSTED: Expected to activate jobs of type 'io.camunda.zeebe:userTask', but no jobs available and at least one broker returned 'RESOURCE_EXHAUSTED'. Please try again later.
Configuration:
3 brokers
3 partitions
Replication factor of 3
**How did we try to resolve it?**
We increased the allocated memory and restarted the broker. After this, 2 brokers were ready, but the third one started returning a 503 error (although the pods are not being restarted).
**Affected pod logs**
2023-12-21 13:05:26.501 [] [Thread-14] INFO
io.atomix.raft.partition.impl.RaftPartitionServer - RaftPartitionServer{raft-partition-partition-2} - Starting server for partition PartitionId{id=2, group=raft-partition}
2023-12-21 13:05:26.500 [] [raft-server-0-raft-partition-partition-3] INFO
io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-3} - Transitioning to FOLLOWER
2023-12-21 13:05:26.507 [] [raft-server-0-raft-partition-partition-3] INFO
io.atomix.raft.impl.DefaultRaftServer - RaftServer{raft-partition-partition-3} - Server join completed. Waiting for the server to be READY
2023-12-21 13:05:27.976 [] [raft-server-0-raft-partition-partition-3] INFO
io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-3} - Found leader 2
2023-12-21 13:05:27.981 [] [raft-server-0-raft-partition-partition-3] INFO
io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-3} - Setting firstCommitIndex to 67793541. RaftServer is ready only after it has committed events upto this index
2023-12-21 13:05:27.982 [] [raft-server-0-raft-partition-partition-3] INFO
io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-3} - Commit index is 67793325. RaftServer is ready only after it has committed events up to index 67793541
2023-12-21 13:05:33.062 [] [raft-server-0-raft-partition-partition-2] INFO
io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-2} - Transitioning to FOLLOWER
2023-12-21 13:05:33.062 [] [Thread-14] INFO
io.atomix.raft.partition.impl.RaftPartitionServer - RaftPartitionServer{raft-partition-partition-1} - Starting server for partition PartitionId{id=1, group=raft-partition}
2023-12-21 13:05:33.063 [] [raft-server-0-raft-partition-partition-2] INFO
io.atomix.raft.impl.DefaultRaftServer - RaftServer{raft-partition-partition-2} - Server join completed. Waiting for the server to be READY
2023-12-21 13:05:33.198 [] [raft-server-0-raft-partition-partition-2] INFO
io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-2} - Found leader 1
2023-12-21 13:05:33.249 [] [raft-server-0-raft-partition-partition-3] INFO
io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-3} - Commit index is 67793553. RaftServer is ready
2023-12-21 13:05:33.250 [] [raft-server-0-raft-partition-partition-3] INFO
io.atomix.raft.partition.impl.RaftPartitionServer - RaftPartitionServer{raft-partition-partition-3} - Successfully started server for partition PartitionId{id=3, group=raft-partition} in 23865ms
2023-12-21 13:05:37.074 [] [raft-server-0-raft-partition-partition-2] INFO
io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-2} - Setting firstCommitIndex to 68024967. RaftServer is ready only after it has committed events upto this index
2023-12-21 13:05:37.074 [] [raft-server-0-raft-partition-partition-2] INFO
io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-2} - Commit index is 67962333. RaftServer is ready only after it has committed events up to index 68024967
2023-12-21 13:05:37.155 [] [raft-server-0-raft-partition-partition-2] INFO
io.atomix.raft.roles.FollowerRole - RaftServer{raft-partition-partition-2}{role=FOLLOWER} - Started receiving new snapshot FileBasedReceivedSnapshot{directory=/usr/local/zeebe/data/raft-partition/partitions/2/pending/68023836-65413-68072070-68072080-1, snapshotStore=Broker-0-SnapshotStore-2, metadata=FileBasedSnapshotId{index=68023836, term=65413, processedPosition=68072070, exporterPosition=68072080}} from 1
2023-12-21 13:05:42.231 [] [raft-server-0-raft-partition-partition-1] INFO
io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-1} - Transitioning to FOLLOWER
2023-12-21 13:05:42.231 [] [raft-server-0-raft-partition-partition-1] INFO
io.atomix.raft.impl.DefaultRaftServer - RaftServer{raft-partition-partition-1} - Server join completed. Waiting for the server to be READY
2023-12-21 13:05:42.289 [] [raft-server-0-raft-partition-partition-1] INFO
io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-1} - Found leader 1
2023-12-21 13:05:49.820 [] [raft-server-0-raft-partition-partition-1] INFO
io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-1} - Setting firstCommitIndex to 68238847. RaftServer is ready only after it has committed events upto this index
2023-12-21 13:05:49.821 [] [raft-server-0-raft-partition-partition-1] INFO
io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-1} - Commit index is 68127423. RaftServer is ready only after it has committed events up to index 68238847
2023-12-21 13:05:49.852 [] [raft-server-0-raft-partition-partition-2] INFO
io.atomix.raft.roles.FollowerRole - RaftServer{raft-partition-partition-2}{role=FOLLOWER} - Rolling back snapshot FileBasedReceivedSnapshot{directory=/usr/local/zeebe/data/raft-partition/partitions/2/pending/68023836-65413-68072070-68072080-1, snapshotStore=Broker-0-SnapshotStore-2, metadata=FileBasedSnapshotId{index=68023836, term=65413, processedPosition=68072070, exporterPosition=68072080}}
2023-12-21 13:05:49.853 [] [raft-server-0-raft-partition-partition-2] INFO
io.atomix.raft.roles.FollowerRole - RaftServer{raft-partition-partition-2}{role=FOLLOWER} - Started receiving new snapshot FileBasedReceivedSnapshot{directory=/usr/local/zeebe/data/raft-partition/partitions/2/pending/68023836-65413-68072070-68072080-2, snapshotStore=Broker-0-SnapshotStore-2, metadata=FileBasedSnapshotId{index=68023836, term=65413, processedPosition=68072070, exporterPosition=68072080}} from 1
2023-12-21 13:27:17.067 [] [raft-server-0-raft-partition-partition-2] INFO
io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-2} - Found leader 2
2023-12-21 13:27:17.112 [] [raft-server-0-raft-partition-partition-2] INFO
io.atomix.raft.roles.FollowerRole - RaftServer{raft-partition-partition-2}{role=FOLLOWER} - Started receiving new snapshot FileBasedReceivedSnapshot{directory=/usr/local/zeebe/data/raft-partition/partitions/2/pending/68035306-65488-68083723-68083606-84, snapshotStore=Broker-0-SnapshotStore-2, metadata=FileBasedSnapshotId{index=68035306, term=65488, processedPosition=68083723, exporterPosition=68083606}} from 2
2023-12-21 13:27:28.722 [] [raft-server-0-raft-partition-partition-2] INFO
io.atomix.raft.roles.FollowerRole - RaftServer{raft-partition-partition-2}{role=FOLLOWER} - Rolling back snapshot FileBasedReceivedSnapshot{directory=/usr/local/zeebe/data/raft-partition/partitions/2/pending/68035306-65488-68083723-68083606-84, snapshotStore=Broker-0-SnapshotStore-2, metadata=FileBasedSnapshotId{index=68035306, term=65488, processedPosition=68083723, exporterPosition=68083606}}
2023-12-21 13:27:33.606 [] [raft-server-0-raft-partition-partition-1] INFO
io.atomix.raft.impl.RaftContext - RaftServer{raft-partition-partition-1} - Found leader 2
2023-12-21 13:27:33.649 [] [raft-server-0-raft-partition-partition-1] INFO
io.atomix.raft.roles.FollowerRole - RaftServer{raft-partition-partition-1}{role=FOLLOWER} - Started receiving new snapshot FileBasedReceivedSnapshot{directory=/usr/local/zeebe/data/raft-partition/partitions/1/pending/68246678-61620-68289730-68289744-74, snapshotStore=Broker-0-SnapshotStore-1, metadata=FileBasedSnapshotId{index=68246678, term=61620, processedPosition=68289730, exporterPosition=68289744}} from 2
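To make the logs above easier to read, here is a rough Python sketch that works out how far each partition on the affected broker was behind when it joined, that is, the gap between its commit index and the index it has to reach before it is reported as ready. The helper name `commit_lag` and the regular expression are my own for illustration and are not part of Zeebe; the three input lines are copied from the logs above with the timestamps left out.

```python
import re

# Log messages copied from the affected pod's output above (timestamps omitted).
LOG_LINES = [
    "RaftServer{raft-partition-partition-3} - Commit index is 67793325. "
    "RaftServer is ready only after it has committed events up to index 67793541",
    "RaftServer{raft-partition-partition-2} - Commit index is 67962333. "
    "RaftServer is ready only after it has committed events up to index 68024967",
    "RaftServer{raft-partition-partition-1} - Commit index is 68127423. "
    "RaftServer is ready only after it has committed events up to index 68238847",
]

# Hypothetical pattern written for this post; it only matches the
# "Commit index is X ... up to index Y" message format seen above.
PATTERN = re.compile(
    r"RaftServer\{raft-partition-partition-(\d+)\} - Commit index is (\d+)\. "
    r"RaftServer is ready only after it has committed events up to index (\d+)"
)


def commit_lag(lines):
    """Return {partition_id: target_index - commit_index} for matching lines."""
    lag = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:
            partition, commit, target = (int(group) for group in match.groups())
            lag[partition] = target - commit
    return lag


if __name__ == "__main__":
    for partition, behind in sorted(commit_lag(LOG_LINES).items()):
        print(f"partition {partition}: {behind} entries behind")
```

With these three lines it reports partition 1 as 111424 entries behind, partition 2 as 62634, and partition 3 as 216. Partition 3 is the only one that logs "RaftServer is ready" in the output above; partitions 1 and 2 are the ones that keep receiving snapshots from the leader.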
I am new to this and might have missed some information that would be helpful here. Any help would be appreciated.
Thank you!