JobWorker of Type io.camunda.zeebe:userTask not triggered as expected

Hi all,

Following setup:
Spring Boot Version: 3.2.7
Spring Boot Starter Camunda: 8.5.3
JDK: 21

Running Camunda via Docker Compose with Version: 8.5.3

We have a Controller which is triggered from the Frontend and creates a new Process instance:

    @Transactional
    public void createProcessInstance(String bpmnProcessId, Long applicationId) {
        // ... some preparation and checks ...

        ZeebeCreateInstanceVariablesDto variablesDto = ZeebeCreateInstanceVariablesDto.builder()
                .applicantId(user.getId())
                .applicationId(applicationId)
                .build();

        Long processInstanceId = zeebeClient.newCreateInstanceCommand()
                .bpmnProcessId(bpmnProcessId)
                .latestVersion()
                .variables(variablesDto)
                .send()
                .join()
                .getProcessInstanceKey();

        // ... updating entity here ...
    }

After this the process is started and goes to the first UserTask in our Camunda Process.
Now we would expect that the Job Worker of type io.camunda.zeebe:userTask is triggered.

    @Transactional
    @JobWorker(type = "io.camunda.zeebe:userTask", autoComplete = false, fetchAllVariables = true, streamEnabled = false)
    public void handleUserTaskCreate(
            final ActivatedJob job,
            @CustomHeaders final Map<String, String> headers,
            @Variable(name = "applicationId") final Long applicationId,
            @VariablesAsType final JsonNode variables
    ) {
        final Long stepKey = job.getKey();
        final String stepName = job.getElementId();

        final String assigneeHeader = headers.getOrDefault("io.camunda.zeebe:assignee", null);
        final Long assigneeId = Long.parseLong(assigneeHeader);
        final String jsonFormHeader = headers.getOrDefault("jsonForm", null);

        // ... load entities for checks and get data which is saved in the process step entity at the end ...

        // this is an attempt for a workaround
        log.info("Deactivate retries for step {}", stepKey);
        zeebeClient.newUpdateRetriesCommand(job)
                .retries(0)
                .send();
    }
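
A side note on the worker above: `Long.parseLong` throws a `NumberFormatException` when the assignee header is missing, because `getOrDefault(..., null)` returns `null` in that case. Below is a minimal sketch of a null-safe variant, assuming an unassigned task should simply yield no assignee. The class and method names are hypothetical and not part of our service.

```java
import java.util.Map;
import java.util.OptionalLong;

final class AssigneeHeader {
    // Header key as used in the worker above.
    static final String ASSIGNEE_HEADER = "io.camunda.zeebe:assignee";

    // Returns the assignee id, or an empty OptionalLong when the header is
    // missing, blank, or not a valid number (hypothetical helper).
    static OptionalLong parseAssigneeId(Map<String, String> headers) {
        String raw = headers.get(ASSIGNEE_HEADER);
        if (raw == null || raw.isBlank()) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(raw.trim()));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseAssigneeId(Map.of(ASSIGNEE_HEADER, "42")).isPresent());
        System.out.println(parseAssigneeId(Map.of()).isPresent());
    }
}
```

Whether an unparsable header should be ignored or should fail the job is a design choice. The sketch ignores it so that the worker does not throw before it reaches its own error handling.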

When we start our setup with Docker Compose and start the process, it does not always work as expected. The Application is persisted successfully, but most of the time the UserTask worker is not triggered on the first try. We don't see anything in our logs, but in Elasticsearch we see that the job for the process is created and then goes to intent timeout. When we start another process, it works directly and behaves as expected. Additionally, when we restart our Spring Boot app, the UserTask that was not triggered is picked up and also works as expected.

We have tried different configurations (maxJobsActive, async vs. sync process starting), but nothing really changed this behaviour.

Do any of you have an idea what could cause our issues here?
We also tried to separate the transactional part, but that didn't help us either. (https://camunda.com/blog/2023/12/navigating-technical-transactions-camunda-8-spring/)

Maybe this is linked to the topic "Zeebe Spring Client JobWorker needs up to 10 Minutes until gets active".

Thank you for the support!
Best Regards

Additionally the spring boot config we use:

camunda:
  client:
    mode: simple
    zeebe:
      enabled: true
      gateway-url: http://localhost:26500
      base-url: http://localhost:9600

docker-compose for zeebe:

  zeebe: # https://docs.camunda.io/docs/self-managed/platform-deployment/docker/#zeebe
    image: camunda/zeebe:${CAMUNDA_PLATFORM_VERSION}
    container_name: zeebe
    ports:
      - "26500:26500"
      - "26501:26501"
      - "26502:26502"
      - "8183:8080"
      - "9600:9600"
    environment: # https://docs.camunda.io/docs/self-managed/zeebe-deployment/configuration/environment-variables/
      - ZEEBE_BROKER_EXPORTERS_ELASTICSEARCH_CLASSNAME=io.camunda.zeebe.exporter.ElasticsearchExporter
      - ZEEBE_BROKER_EXPORTERS_ELASTICSEARCH_ARGS_URL=http://elasticsearch:9200
      # default is 1000, see here: https://github.com/camunda/zeebe/blob/main/exporters/elasticsearch-exporter/src/main/java/io/camunda/zeebe/exporter/ElasticsearchExporterConfiguration.java#L259
      - ZEEBE_BROKER_EXPORTERS_ELASTICSEARCH_ARGS_BULK_SIZE=1
      # allow running with low disk space
      - ZEEBE_BROKER_DATA_DISKUSAGECOMMANDWATERMARK=0.998
      - ZEEBE_BROKER_DATA_DISKUSAGEREPLICATIONWATERMARK=0.999
      # tried playing around with message size
      #- ZEEBE_BROKER_NETWORK_MAXMESSAGESIZE=40MB
      - ZEEBE_LOG_LEVEL=debug
      - "JAVA_TOOL_OPTIONS=-Xms512m -Xmx512m"
    restart: always
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "timeout 10s bash -c ':> /dev/tcp/127.0.0.1/9600' || exit 1"
        ]
      interval: 30s
      timeout: 5s
      retries: 5
      start_period: 30s

We see several different log entries and are not sure whether they are related to our issues with the job triggering:

2024-07-11 12:42:02 2024-07-11T10:42:02.072Z  WARN 1 --- [applications-service] [ault-executor-1] io.camunda.zeebe.client.job.poller       : Failed to activate jobs for worker userTaskService#handleUserTaskCreate and job type io.camunda.zeebe:userTask
2024-07-11 12:42:02
2024-07-11 12:42:02 io.grpc.StatusRuntimeException: INTERNAL: Encountered end-of-stream mid-frame
2024-07-11 12:42:02     at io.grpc.Status.asRuntimeException(Status.java:533) ~[grpc-api-1.62.2.jar:1.62.2]
2024-07-11 12:42:02     at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:481) ~[grpc-stub-1.62.2.jar:1.62.2]
2024-07-11 12:42:02     at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:574) ~[grpc-core-1.62.2.jar:1.62.2]
2024-07-11 12:42:02     at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:72) ~[grpc-core-1.62.2.jar:1.62.2]
2024-07-11 12:42:02     at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:742) ~[grpc-core-1.62.2.jar:1.62.2]
2024-07-11 12:42:02     at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:723) ~[grpc-core-1.62.2.jar:1.62.2]
2024-07-11 12:42:02     at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) ~[grpc-core-1.62.2.jar:1.62.2]
2024-07-11 12:42:02     at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133) ~[grpc-core-1.62.2.jar:1.62.2]
2024-07-11 12:42:02     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[na:na]
2024-07-11 12:42:02     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[na:na]
2024-07-11 12:42:02     at java.base/java.lang.Thread.run(Thread.java:1583) ~[na:na]

another one:

2024-07-11T16:45:17.581Z  WARN 1 --- [applications-service] [pool-2-thread-1] io.camunda.zeebe.client.job.worker       : Failed to stream jobs of type 'io.camunda.zeebe:userTask' to worker 'userTaskService#handleUserTaskCreate'

io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 28799.998935685s. Name resolution delay 0.000000000 seconds. [closed=[], committed=[remote_addr=zeebe-gateway/172.30.244.96:26500]]
	at io.grpc.Status.asRuntimeException(Status.java:533) ~[grpc-api-1.62.2.jar:1.62.2]
	at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:481) ~[grpc-stub-1.62.2.jar:1.62.2]
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:574) ~[grpc-core-1.62.2.jar:1.62.2]
	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:72) ~[grpc-core-1.62.2.jar:1.62.2]
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:742) ~[grpc-core-1.62.2.jar:1.62.2]
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:723) ~[grpc-core-1.62.2.jar:1.62.2]
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) ~[grpc-core-1.62.2.jar:1.62.2]
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133) ~[grpc-core-1.62.2.jar:1.62.2]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[na:na]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[na:na]
	at java.base/java.lang.Thread.run(Thread.java:1583) ~[na:na]
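
The deadline in this trace is worth converting: 28799.998935685 s is almost exactly 8 hours, which suggests the job stream ran into a fixed timeout rather than a random network failure. That it is the client's stream timeout is an assumption on our side. A small sketch of the arithmetic, where the class name is only for illustration:

```java
import java.time.Duration;

final class StreamDeadline {
    // Value copied from the DEADLINE_EXCEEDED message above.
    static final double DEADLINE_SECONDS = 28799.998935685;

    // Rounds a fractional number of seconds to the nearest whole second.
    static Duration roundToSeconds(double seconds) {
        return Duration.ofSeconds(Math.round(seconds));
    }

    public static void main(String[] args) {
        Duration deadline = roundToSeconds(DEADLINE_SECONDS);
        System.out.println(deadline.toHours() + " h " + deadline.toMinutesPart() + " min");
    }
}
```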

another one:

2024-07-09T13:10:23.386+02:00 DEBUG 59947 --- [applications-service] [ault-executor-1] io.camunda.zeebe.client.job.worker       : Failed to activate jobs due to RESOURCE_EXHAUSTED: gRPC message exceeds maximum size 4194304: 125632512, delay retry for 153 ms
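
To put the two numbers from this message into perspective: the limit of 4194304 bytes is 4 MiB, while the rejected message of 125632512 bytes is roughly 120 MiB, so about 30 times the limit. Since our worker uses fetchAllVariables = true, large process variables in the activation response are one plausible cause, but that is an assumption we have not verified. A small sketch of the conversion, with an illustrative class name:

```java
final class GrpcMessageSize {
    // Both values are copied from the RESOURCE_EXHAUSTED log line above.
    static final long MAX_MESSAGE_BYTES = 4_194_304L;
    static final long ACTUAL_MESSAGE_BYTES = 125_632_512L;

    // Converts a byte count to mebibytes (1 MiB = 1024 * 1024 bytes).
    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    // How many times larger the actual message is than the limit.
    static double ratio(long actualBytes, long maxBytes) {
        return (double) actualBytes / maxBytes;
    }

    public static void main(String[] args) {
        System.out.println("limit:  " + toMiB(MAX_MESSAGE_BYTES) + " MiB");
        System.out.println("actual: " + toMiB(ACTUAL_MESSAGE_BYTES) + " MiB");
        System.out.println("ratio:  " + ratio(ACTUAL_MESSAGE_BYTES, MAX_MESSAGE_BYTES));
    }
}
```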

another one (not sure if that's an issue, since it is logged at INFO):

2024-07-11 12:47:47 2024-07-11T10:47:47.741Z  INFO 1 --- [applications-service] [-worker-ELG-1-4] io.grpc.internal.AbstractClientStream    : Received trailers on closed stream:

Any ideas?

Thanks to this GitHub issue we were able to resolve it:

Thanks for your support!
