Hi,
We are evaluating Zeebe for one of our use cases. To understand how Zeebe handles replication and leader role transitions when one or more brokers go down, we tried the configuration below.
Config (equivalent broker environment variables shown below):
- Cluster size: 5
- Partition count: 4
- Replication factor: 2
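For reference, these settings roughly correspond to the following broker environment variables in our StatefulSet (shown only as an illustration; the equivalent application.yaml keys would live under zeebe.broker.cluster):

ZEEBE_BROKER_CLUSTER_CLUSTERSIZE=5
ZEEBE_BROKER_CLUSTER_PARTITIONSCOUNT=4
ZEEBE_BROKER_CLUSTER_REPLICATIONFACTOR=2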
Below is the status from zbctl
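(We ran roughly the following; --insecure because TLS is not enabled in our test setup:)

zbctl status --insecure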
{
"brokers": [
{
"nodeId": 3,
"host": "zb-zeebe-3.zb-zeebe.default.svc.cluster.local",
"port": 26501,
"partitions": [
{
"partitionId": 4,
"role": "LEADER",
"health": "HEALTHY"
},
{
"partitionId": 3,
"role": "LEADER",
"health": "HEALTHY"
}
],
"version": "1.0.0"
},
{
"nodeId": 4,
"host": "zb-zeebe-4.zb-zeebe.default.svc.cluster.local",
"port": 26501,
"partitions": [
{
"partitionId": 4,
"role": "FOLLOWER",
"health": "HEALTHY"
}
],
"version": "1.0.0"
},
{
"nodeId": 0,
"host": "zb-zeebe-0.zb-zeebe.default.svc.cluster.local",
"port": 26501,
"partitions": [
{
"partitionId": 1,
"role": "LEADER",
"health": "HEALTHY"
}
],
"version": "1.0.0"
},
{
"nodeId": 2,
"host": "zb-zeebe-2.zb-zeebe.default.svc.cluster.local",
"port": 26501,
"partitions": [
{
"partitionId": 3,
"role": "FOLLOWER",
"health": "HEALTHY"
},
{
"partitionId": 2,
"role": "LEADER",
"health": "HEALTHY"
}
],
"version": "1.0.0"
},
{
"nodeId": 1,
"host": "zb-zeebe-1.zb-zeebe.default.svc.cluster.local",
"port": 26501,
"partitions": [
{
"partitionId": 1,
"role": "FOLLOWER",
"health": "HEALTHY"
},
{
"partitionId": 2,
"role": "FOLLOWER",
"health": "HEALTHY"
}
],
"version": "1.0.0"
}
],
"clusterSize": 5,
"partitionsCount": 4,
"replicationFactor": 2,
"gatewayVersion": "1.0.0"
}
Now, we scaled the pods down to 2.
We were still able to deploy a workflow, so we assumed the cluster can still accept requests as long as at least 2 brokers are available (as per the quorum logic).
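For reference, the scale-down and the test deployment were done roughly like this; the StatefulSet name zb-zeebe is inferred from the pod host names above, and the BPMN file name is just illustrative:

kubectl scale statefulset zb-zeebe --replicas=2
zbctl deploy order-process.bpmn --insecure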
Below is the cluster status now; from it we can see that partition 2 only has a follower (no leader), and partitions 3 and 4 are not listed at all:
{
"brokers": [
{
"nodeId": 0,
"host": "zb-zeebe-0.zb-zeebe.default.svc.cluster.local",
"port": 26501,
"partitions": [
{
"partitionId": 1,
"role": "LEADER",
"health": "HEALTHY"
}
],
"version": "1.0.0"
},
{
"nodeId": 1,
"host": "zb-zeebe-1.zb-zeebe.default.svc.cluster.local",
"port": 26501,
"partitions": [
{
"partitionId": 1,
"role": "FOLLOWER",
"health": "HEALTHY"
},
{
"partitionId": 2,
"role": "FOLLOWER",
"health": "HEALTHY"
}
],
"version": "1.0.0"
}
],
"clusterSize": 5,
"partitionsCount": 4,
"replicationFactor": 2,
"gatewayVersion": "1.0.0"
}
Our doubts are:
- Why didn't node 1 become the leader for partition 2? As per the Zeebe docs, node 1 should have become the leader for partition 2, cmiiw?
- Is the quorum value computed at the cluster level or per partition?
- In the above scenario, what happens if job workers are in the middle of executing jobs and the broker holding their partition goes down?
- In another scenario we kept the partition count at 2, the replication factor at 2, and the cluster size at 5, but only 3 nodes came up; the readiness probes of the remaining two nodes keep returning 503. What could be the reason for this? (See the probe check sketch after this list.)
- If we want to upgrade the Zeebe cluster in the future, what scenarios should we consider with respect to resilience and fault tolerance?
- We are planning to set up the Zeebe brokers on AWS. Is a Zeebe cluster available as an AWS AMI image, similar to Elasticsearch?
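Regarding the readiness probe 503s, this is roughly how we observe them; the monitoring port 9600 and the /ready endpoint are our assumptions about the default broker configuration, and the pod name is just an example of one of the non-ready pods:

kubectl port-forward zb-zeebe-3 9600:9600
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:9600/ready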