Operate process view 4 hours behind Zeebe

Hello,

I have just run a test starting 10,000 processes, with each process passing through roughly 50 workers. At the end of each process an entry is written to the database. After about an hour I can see that all processes have completed and the worker application is now idle, but the Operate process view only shows 400 processes as completed. It was only 4 hours later that it finally caught up.

Do you know why it got so far behind and took so long to catch up?

I have attached the config and docker-compose below.

docker-compose

version: "3.7"
services:

  broker0:
    image: camunda/zeebe:8.0.2
    ports:
      - "26500:26500"
      - "9600:9600"
    environment:
      - "JAVA_TOOL_OPTIONS=-Xms512m -Xmx512m"
    volumes:
      - broker0:/usr/local/zeebe/data
      - ./skins/zeebeledger/broker0-application.yaml:/usr/local/zeebe/config/application.yaml
      - ./skins/zeebeledger/broker-log4j2.xml:/usr/local/zeebe/config/log4j2.xml
    networks:
      default:
    depends_on:
      - elasticsearch
    profiles:
      - cluster

  broker1:
    image: camunda/zeebe:8.0.2
    environment:
      - "JAVA_TOOL_OPTIONS=-Xms512m -Xmx512m"
    volumes:
      - broker1:/usr/local/zeebe/data
      - ./skins/zeebeledger/broker1-application.yaml:/usr/local/zeebe/config/application.yaml
      - ./skins/zeebeledger/broker-log4j2.xml:/usr/local/zeebe/config/log4j2.xml
    networks:
      default:
    depends_on:
      - elasticsearch
    profiles:
      - cluster

  broker2:
    image: camunda/zeebe:8.0.2
    environment:
      - "JAVA_TOOL_OPTIONS=-Xms512m -Xmx512m"
    volumes:
      - broker2:/usr/local/zeebe/data
      - ./skins/zeebeledger/broker2-application.yaml:/usr/local/zeebe/config/application.yaml
      - ./skins/zeebeledger/broker-log4j2.xml:/usr/local/zeebe/config/log4j2.xml
    networks:
      default:
    depends_on:
      - elasticsearch
    profiles:
      - cluster

  operate:
    image: camunda/operate:8.0.2
    ports:
      - "8081:8080"
    environment:
      - CAMUNDA_OPERATE_ZEEBE_GATEWAYADDRESS=broker0:26500
      - CAMUNDA_OPERATE_ELASTICSEARCH_URL=http://elasticsearch:9200
      - CAMUNDA_OPERATE_ZEEBEELASTICSEARCH_URL=http://elasticsearch:9200
    networks:
      default:
    depends_on:
      - elasticsearch

  tasklist:
    image: camunda/tasklist:8.0.2
    ports:
      - "8082:8080"
    environment:
      - CAMUNDA_TASKLIST_ZEEBE_GATEWAYADDRESS=broker0:26500
      - CAMUNDA_TASKLIST_ELASTICSEARCH_URL=http://elasticsearch:9200
      - CAMUNDA_TASKLIST_ZEEBEELASTICSEARCH_URL=http://elasticsearch:9200
    networks:
      default:
    depends_on:
      - elasticsearch

  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.0
    ports:
      - "9200:9200"
      - "9300:9300"
    environment:
      - bootstrap.memory_lock=true
      - discovery.type=single-node
      # allow running with low disk space
      - cluster.routing.allocation.disk.threshold_enabled=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    healthcheck:
      test: [ "CMD-SHELL", "curl -f http://localhost:9200/_cat/health | grep -q green" ]
      interval: 30s
      timeout: 5s
      retries: 3
    volumes:
      - elastic:/usr/share/elasticsearch/data
    networks:
      default:

volumes:
  broker0:
    driver: local
  broker1:
    driver: local
  broker2:
    driver: local
  elastic:
    driver: local

broker application.yaml (all 3 brokers are the same except for the nodeId, which is 0|1|2)

zeebe:
  broker:
    gateway:
      enable: true
    network:
      host: 0.0.0.0
      security:
        enabled: false
    cluster:
      nodeId: 0
      partitionsCount: 2
      replicationFactor: 2
      clusterSize: 3
      initialContactPoints: [ broker0:26502, broker1:26502, broker2:26502 ]
      clusterName: zeebe-cluster
    threads:
      cpuThreadCount: 10
    backpressure:
      enabled: true
      algorithm: "fixed"
      fixed:
        limit: 100
    exporters:
      elasticsearch:
        className: io.camunda.zeebe.exporter.ElasticsearchExporter
        args:
          url: http://elasticsearch:9200
          bulk:
            size: 1000

Thanks,
Matt

Thanks @matt for raising this here.

This is a topic we are actively investigating and working on; for example, there are some improvements planned on the Operate importer side. One thing you can do is scale the Elasticsearch cluster, since it has a large impact here. It also makes sense to give Operate enough resources and importer threads. For details about the configuration, please take a look at the documentation: Importer and archiver | Camunda Platform 8
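As a rough sketch of what that tuning could look like in the docker-compose file above: the JVM option mirrors what the brokers already set, and `CAMUNDA_OPERATE_IMPORTER_THREADSCOUNT` is the environment-variable form of the `camunda.operate.importer.threadsCount` property from the Importer and archiver docs. The specific values are illustrative, and the property names should be verified against the docs for your Operate version:

```yaml
  operate:
    image: camunda/operate:8.0.2
    ports:
      - "8081:8080"
    environment:
      # Give the Operate JVM more headroom than the 512m the brokers use
      # (value is an example, not a recommendation)
      - "JAVA_TOOL_OPTIONS=-Xms1g -Xmx1g"
      # Env-var form of camunda.operate.importer.threadsCount; lets the
      # importer consume records from multiple partitions in parallel.
      # Verify the exact name against your Operate version's docs.
      - CAMUNDA_OPERATE_IMPORTER_THREADSCOUNT=4
      - CAMUNDA_OPERATE_ZEEBE_GATEWAYADDRESS=broker0:26500
      - CAMUNDA_OPERATE_ELASTICSEARCH_URL=http://elasticsearch:9200
      - CAMUNDA_OPERATE_ZEEBEELASTICSEARCH_URL=http://elasticsearch:9200
```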

Hope that helps.

Greets
Chris

Thanks @Zelldon,

I think I had a misunderstanding of how things work. I thought the exporters in Zeebe were what populated Elasticsearch for Operate to use, but it sounds like Operate maintains its own data by importing from Zeebe and storing it in Elasticsearch. Is that correct?
That would explain the slowness, since Operate was left with its default configuration.

Thanks,
Matt

Hey @matt

it is half correct :smile:

Exporters run inside Zeebe; enabling the Elasticsearch exporter, for example, streams data from the Zeebe log stream to Elasticsearch. The exported records are in a general format that can be consumed by any application.

Operate then needs to aggregate those records: its importer reads the raw records from Elasticsearch, transforms them into Operate's own format, and writes the result back to Elasticsearch.
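You can actually see this split in the docker-compose file above: Operate is configured with two Elasticsearch URLs, one for reading the raw Zeebe records and one for its own aggregated indices (in this single-node setup both point at the same cluster):

```yaml
    environment:
      # Elasticsearch that holds Operate's own aggregated indices
      - CAMUNDA_OPERATE_ELASTICSEARCH_URL=http://elasticsearch:9200
      # Elasticsearch that the Zeebe exporter writes raw records to;
      # the Operate importer reads from here
      - CAMUNDA_OPERATE_ZEEBEELASTICSEARCH_URL=http://elasticsearch:9200
```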

Does this help you?

Greets
Chris

See here for documentation on the architecture: Reporting about processes | Camunda Platform 8
