Hi there!
I’ve successfully deployed a Zeebe cluster in k8s with 3 nodes and 3 partitions. It seemed stable, but I found the storage almost exhausted (6 GB) within a day or two, even though the cluster didn’t execute any workflows. I started to investigate the problem and found a lot of useful information about performance details in issue 541.
I kept the snapshot size, the maximum number of snapshots, and the snapshot period at their defaults (500 MB, 3 at most, and 15 minutes, respectively). On the client side I increased pollInterval to 5000 ms in order to reduce the polling intensity (and thus reduce the log growth as well).
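For context, the worker is set up roughly like this (a minimal sketch against the 0.x Java client; the gateway address, job type, and handler are placeholders, and builder methods can differ slightly between client versions):

import io.zeebe.client.ZeebeClient;
import io.zeebe.client.api.worker.JobWorker;
import java.time.Duration;

public class WorkerExample {

  public static void main(final String[] args) {
    try (ZeebeClient client =
        ZeebeClient.newClientBuilder()
            .brokerContactPoint("zeebe-gateway:26500") // placeholder address
            .build()) {

      // pollInterval raised to 5000 ms so the worker asks the broker for new
      // jobs less often, which reduces the polling-related traffic on the log.
      final JobWorker worker =
          client.newWorker()
              .jobType("print") // placeholder job type (taken from the dump below)
              .handler((jobClient, job) ->
                  jobClient.newCompleteCommand(job.getKey()).send().join())
              .name("test-worker-name")
              .pollInterval(Duration.ofMillis(5000))
              .open();

      // ... keep the worker open while jobs are being processed ...
    }
  }
}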
Still, the problem persists; storage utilization just grows more slowly than before (only ~5K events per minute). I took a look at the segment contents and found a lot of repetitive information:
... type ... print ... worker ... test-worker-name ... timeout ... maxJobsToActivate ... jobKeys ... jobs ... variables ... truncated ...
... mirror_string ... worker ... test-worker-name ... timeout ... maxJobsToActivate ... jobKeys ... jobs ... variables ... truncated ...
(the same worker / test-worker-name / timeout / maxJobsToActivate / jobKeys / jobs / variables / truncated block repeats over and over, separated by unprintable bytes)
I use a custom Hazelcast exporter (which connects to an external Hazelcast cluster). Nevertheless, I make sure to update the current exporter record position so that Zeebe can clean up segments when necessary.
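The position handling in the exporter looks roughly like this (a minimal sketch, assuming the 0.x exporter API; the actual Hazelcast publishing code is left out):

import io.zeebe.exporter.api.Exporter;
import io.zeebe.exporter.api.context.Controller;
import io.zeebe.protocol.record.Record;

public class HazelcastExporter implements Exporter {

  private Controller controller;

  @Override
  public void open(final Controller controller) {
    this.controller = controller;
  }

  @Override
  public void export(final Record record) {
    // ... publish the record to the external Hazelcast cluster here ...

    // Acknowledge the record so the broker may delete log segments up to
    // this position once every configured exporter has passed it.
    controller.updateLastExportedRecordPosition(record.getPosition());
  }
}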
Could you explain why so many events are produced and how to reduce resource usage?
Here is my config:
# For more information about this configuration visit:
[threads]
cpuThreadCount = 1
[gateway.monitoring]
enabled = true
[[exporters]]
id = "hazelcast"
className = "org.project.HazelcastExporter"
[exporters.args]
host = "hazelcast.dev.svc.cluster.local"
enabledValueTypes = "JOB,WORKFLOW_INSTANCE,DEPLOYMENT,INCIDENT,TIMER,VARIABLE,MESSAGE,MESSAGE_SUBSCRIPTION,MESSAGE_START_EVENT_SUBSCRIPTION"
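If it is relevant, the enabledValueTypes argument is intended to restrict which records the exporter receives; here is a rough sketch of how it can be wired up as a record filter (assuming the 0.x exporter API and its Context#setFilter; the parsing is simplified and the class name is hypothetical):

import io.zeebe.exporter.api.Exporter;
import io.zeebe.exporter.api.context.Context;
import io.zeebe.protocol.record.Record;
import io.zeebe.protocol.record.RecordType;
import io.zeebe.protocol.record.ValueType;
import java.util.EnumSet;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FilteringHazelcastExporter implements Exporter {

  @Override
  public void configure(final Context context) {
    // "enabledValueTypes" is the comma-separated list from [exporters.args]
    final String enabled =
        (String) context.getConfiguration().getArguments().get("enabledValueTypes");

    final Set<ValueType> accepted =
        Stream.of(enabled.split(","))
            .map(String::trim)
            .map(ValueType::valueOf)
            .collect(Collectors.toCollection(() -> EnumSet.noneOf(ValueType.class)));

    // Records whose value type is not in the list are never handed to export().
    context.setFilter(new Context.RecordFilter() {
      @Override
      public boolean acceptType(final RecordType recordType) {
        return true; // keep commands, events and rejections alike
      }

      @Override
      public boolean acceptValue(final ValueType valueType) {
        return accepted.contains(valueType);
      }
    });
  }

  @Override
  public void export(final Record record) {
    // ... publish to Hazelcast and update the exported position, as above ...
  }
}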
Here you can find a pod resource usage graph
Here is the zeebe_exporter_events_total metric graph