package com.demo.adapt;

import java.time.Duration;
import javax.annotation.PostConstruct;
import javax.annotation.PreDestroy;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import io.camunda.zeebe.client.ZeebeClient;
import io.camunda.zeebe.client.api.response.ActivatedJob;
import io.camunda.zeebe.client.api.worker.JobClient;
import io.camunda.zeebe.client.api.worker.JobHandler;
import io.camunda.zeebe.client.api.worker.JobWorker;
import lombok.extern.slf4j.Slf4j;

@Slf4j
@Component
public class TestWorker implements JobHandler {

    @Autowired
    private ZeebeClient client;

    private JobWorker worker;

    @PostConstruct
    public void register() {
        worker = client
            .newWorker()
            .jobType("test001")
            .handler(this)
            .name("test001")
            .maxJobsActive(1)
            .timeout(Duration.ofSeconds(6))
            .open();
        log.info("Job worker test001 opened and receiving jobs");
    }

    @Override
    public void handle(JobClient client, ActivatedJob job) throws Exception {
        System.err.println("currentTime:" + System.currentTimeMillis());
        Thread.sleep(50000); // deliberately blocks the handler thread
        client.newCompleteCommand(job.getKey()).send();
    }

    @PreDestroy
    public void unregister() {
        if (!worker.isClosed()) {
            worker.close();
            log.info("test001 Job worker closed");
        }
    }
}
May I ask: with a timeout of 6 seconds configured, why does the job not give up its resources? It keeps blocking, and other jobs cannot come in.
Hello @bugbugmaker ,
the reason is that, by default, the Zeebe client runs job handlers on a single-threaded thread pool.
This can be configured, but in general it encourages you to write non-blocking handlers using asynchronous Java.
I hope this helps
Jonathan
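To illustrate the point about the single-threaded pool, here is a minimal, self-contained sketch (class and method names are hypothetical, not Zeebe API): two tasks are submitted to a single-threaded executor, and the 300 ms sleep stands in for the 50-second `Thread.sleep` in the worker above. The second task cannot start until the first finishes, which is exactly why other jobs "cannot come in".

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SingleThreadDemo {

    // Measures how long the second submitted task waits before it runs
    // on a single-threaded executor while the first task is blocking.
    static long secondTaskStartDelayMs() {
        ExecutorService single = Executors.newSingleThreadExecutor();
        try {
            long start = System.nanoTime();
            single.submit(() -> sleep(300)); // stands in for Thread.sleep(50000) in the handler
            Future<Long> second = single.submit(
                () -> (System.nanoTime() - start) / 1_000_000);
            return second.get(); // the second "job" only starts after the first finishes
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            single.shutdown();
        }
    }

    static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // The second task's start is delayed by roughly the first task's sleep.
        System.out.println("second task waited " + secondTaskStartDelayMs() + " ms");
    }
}
```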
Can you give me more information? I don’t know how to achieve it
Also, I would like to know the effect of the timeout field
Hello @bugbugmaker ,
you could use an ExecutorService, submit the work to it, and use the returned future.
For more information, you can read here: https://www.baeldung.com/java-asynchronous-programming
The timeout relates to the job in Zeebe: when the timeout expires, the job becomes available again for execution.
Jonathan
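The ExecutorService suggestion above can be sketched as follows. This is a plain-Java stand-in (the class name, `handle` signature, and job key parameter are hypothetical, not the Zeebe API): the handler submits the slow work to a pool and returns immediately, so the worker thread is free to pick up the next job; the real handler would call `client.newCompleteCommand(job.getKey()).send()` inside the future instead of returning a string.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncHandlerSketch {

    // Shared pool for the slow work; daemon threads so the JVM can exit.
    private static final ExecutorService pool =
        Executors.newFixedThreadPool(4, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });

    // Stand-in for handle(JobClient, ActivatedJob): submit the slow work
    // and return immediately instead of blocking the worker thread.
    static CompletableFuture<String> handle(String jobKey) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(200); // simulated slow work (50 s in the original worker)
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // In the real handler, complete the job here:
            // client.newCompleteCommand(job.getKey()).send()
            return "completed:" + jobKey;
        }, pool);
    }

    public static void main(String[] args) {
        CompletableFuture<String> result = handle("42"); // returns immediately
        System.out.println(result.join()); // prints completed:42
    }
}
```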
Hello, I understand the meaning of asynchronous invocation. What I want to clarify is the actual function of the worker timeout. I thought that when the timeout was reached and the job was not completed, the worker resources would be released to execute other jobs.
Hello @bugbugmaker ,
in that case, the answer above applies: when the timeout expires, the job simply becomes available for execution again; the worker thread itself is not freed.
Your suggestion could make sense, but it would have to be implemented on your side.
Jonathan
Okay, thank you. May I also ask what is causing the following? It makes Zeebe consume a high amount of CPU.
"http-nio-0.0.0.0-9600-Acceptor" #41 daemon prio=5 os_prio=0 cpu=0.42ms elapsed=52292.65s tid=0x00007fe7a535b9a0 nid=0x4f runnable [0x00007fe7151f4000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.Net.accept(java.base@17.0.4.1/Native Method)
at sun.nio.ch.ServerSocketChannelImpl.implAccept(java.base@17.0.4.1/Unknown Source)
at sun.nio.ch.ServerSocketChannelImpl.accept(java.base@17.0.4.1/Unknown Source)
at org.apache.tomcat.util.net.NioEndpoint.serverSocketAccept(NioEndpoint.java:546)
at org.apache.tomcat.util.net.NioEndpoint.serverSocketAccept(NioEndpoint.java:79)
at org.apache.tomcat.util.net.Acceptor.run(Acceptor.java:129)
at java.lang.Thread.run(java.base@17.0.4.1/Unknown Source)
"Broker-0-zb-actors-0" #42 prio=5 os_prio=0 cpu=43198018.36ms elapsed=52292.44s tid=0x00007fe7a531ed10 nid=0x50 runnable [0x00007fe7150f2000]
java.lang.Thread.State: RUNNABLE
at io.camunda.zeebe.logstreams.impl.log.LogStorageAppender.appendBlock(LogStorageAppender.java:121)
at io.camunda.zeebe.logstreams.impl.log.LogStorageAppender.onWriteBufferAvailable(LogStorageAppender.java:196)
at io.camunda.zeebe.logstreams.impl.log.LogStorageAppender$$Lambda$1556/0x00000008014253e8.run(Unknown Source)
at io.camunda.zeebe.util.sched.ActorJob.invoke(ActorJob.java:74)
at io.camunda.zeebe.util.sched.ActorJob.execute(ActorJob.java:42)
at io.camunda.zeebe.util.sched.ActorTask.execute(ActorTask.java:125)
at io.camunda.zeebe.util.sched.ActorThread.executeCurrentTask(ActorThread.java:97)
at io.camunda.zeebe.util.sched.ActorThread.doWork(ActorThread.java:80)
at io.camunda.zeebe.util.sched.ActorThread.run(ActorThread.java:189)
"Broker-0-zb-actors-1" #43 prio=5 os_prio=0 cpu=36769234.13ms elapsed=52292.44s tid=0x00007fe7a531f880 nid=0x51 waiting on condition [0x00007fe714ff2000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@17.0.4.1/Native Method)
- parking to wait for <0x00000007364b8400> (a java.util.concurrent.CompletableFuture$Signaller)
at java.util.concurrent.locks.LockSupport.park(java.base@17.0.4.1/Unknown Source)
at java.util.concurrent.CompletableFuture$Signaller.block(java.base@17.0.4.1/Unknown Source)
at java.util.concurrent.ForkJoinPool.unmanagedBlock(java.base@17.0.4.1/Unknown Source)
at java.util.concurrent.ForkJoinPool.managedBlock(java.base@17.0.4.1/Unknown Source)
at java.util.concurrent.CompletableFuture.waitingGet(java.base@17.0.4.1/Unknown Source)
at java.util.concurrent.CompletableFuture.join(java.base@17.0.4.1/Unknown Source)
at io.camunda.zeebe.broker.logstreams.LogDeletionService.delegateDeletion(LogDeletionService.java:67)
at io.camunda.zeebe.broker.logstreams.LogDeletionService.lambda$onNewSnapshot$2(LogDeletionService.java:59)
at io.camunda.zeebe.broker.logstreams.LogDeletionService$$Lambda$2091/0x00000008016b5758.run(Unknown Source)
at io.camunda.zeebe.util.sched.ActorJob.invoke(ActorJob.java:72)
at io.camunda.zeebe.util.sched.ActorJob.execute(ActorJob.java:42)
at io.camunda.zeebe.util.sched.ActorTask.execute(ActorTask.java:125)
at io.camunda.zeebe.util.sched.ActorThread.executeCurrentTask(ActorThread.java:97)
at io.camunda.zeebe.util.sched.ActorThread.doWork(ActorThread.java:80)
at io.camunda.zeebe.util.sched.ActorThread.run(ActorThread.java:189)
"Broker-0-zb-fs-workers-0" #44 prio=5 os_prio=0 cpu=700565.30ms elapsed=52292.44s tid=0x00007fe7a5321de0 nid=0x52 runnable [0x00007fe714ef1000]
java.lang.Thread.State: TIMED_WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@17.0.4.1/Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(java.base@17.0.4.1/Unknown Source)
at org.agrona.concurrent.BackoffIdleStrategy.idle(BackoffIdleStrategy.java:214)
at io.camunda.zeebe.util.sched.ActorThread$ActorTaskRunnerIdleStrategy.onIdle(ActorThread.java:267)
at io.camunda.zeebe.util.sched.ActorThread.doWork(ActorThread.java:85)
at io.camunda.zeebe.util.sched.ActorThread.run(ActorThread.java:189)
"Broker-0-zb-fs-workers-1" #45 prio=5 os_prio=0 cpu=700968.45ms elapsed=52292.44s tid=0x00007fe7a53229b0 nid=0x53 runnable [0x00007fe714df0000]
java.lang.Thread.State: TIMED_WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@17.0.4.1/Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(java.base@17.0.4.1/Unknown Source)
at org.agrona.concurrent.BackoffIdleStrategy.idle(BackoffIdleStrategy.java:214)
at io.camunda.zeebe.util.sched.ActorThread$ActorTaskRunnerIdleStrategy.onIdle(ActorThread.java:267)
at io.camunda.zeebe.util.sched.ActorThread.doWork(ActorThread.java:85)
at io.camunda.zeebe.util.sched.ActorThread.run(ActorThread.java:189)
"atomix-cluster-0" #46 prio=5 os_prio=0 cpu=35.80ms elapsed=52290.69s tid=0x00007fe6980ad740 nid=0x59 waiting on condition [0x00007fe7756d8000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@17.0.4.1/Native Method)
- parking to wait for <0x00000007013a20f0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(java.base@17.0.4.1/Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(java.base@17.0.4.1/Unknown Source)
at java.util.concurrent.ForkJoinPool.unmanagedBlock(java.base@17.0.4.1/Unknown Source)
at java.util.concurrent.ForkJoinPool.managedBlock(java.base@17.0.4.1/Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@17.0.4.1/Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@17.0.4.1/Unknown Source)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@17.0.4.1/Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@17.0.4.1/Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@17.0.4.1/Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@17.0.4.1/Unknown Source)
at java.lang.Thread.run(java.base@17.0.4.1/Unknown Source)
"netty-messaging-event-epoll-server-0" #47 prio=5 os_prio=0 cpu=8.88ms elapsed=52290.19s tid=0x00007fe698509750 nid=0x5a runnable [0x00007fe7758da000]
java.lang.Thread.State: RUNNABLE
at io.netty.channel.epoll.Native.epollWait(Native Method)
at io.netty.channel.epoll.Native.epollWait(Native.java:209)
at io.netty.channel.epoll.Native.epollWait(Native.java:202)
at io.netty.channel.epoll.EpollEventLoop.epollWaitNoTimerChange(EpollEventLoop.java:294)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:351)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:995)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.lang.Thread.run(java.base@17.0.4.1/Unknown Source)
Hello @bugbugmaker ,
I am not sure where the provided logs come from.
Jonathan
Hello @bugbugmaker ,
and what caused this high load?
Jonathan
I couldn’t find the specific cause, so I sent out the stack information of Zeebe. Can you help me see what caused it?
Hello @bugbugmaker ,
could it be related to your processes? Do you have a huge multi-instance, or do you kick off many process instances at once (basically, things that put an intermittently extremely high load on the engine)?
Jonathan
Even after disconnecting all Zeebe clients, the CPU usage stays high. There is no way to lower it other than deleting the data in the Zeebe data directory. Can you tell what caused it from the stack information?
Hello @bugbugmaker ,
I am afraid I cannot tell you what caused it. Could you please share the process model that was executed?
Jonathan
It is very difficult to locate which process is causing it. I have many processes, and high CPU usage may appear after running for a period of time. Moreover, restarting Zeebe and disconnecting the clients does not solve the problem. The Zeebe service stack information is as shown above.
Hello @bugbugmaker ,
in the end, process models are like the code that is running: if a model has problems, it will probably cause issues. Some of the main causes could be live-locks (infinite loops), multi-instances, or recursion.
If your processes have any of these patterns in place, they would potentially be the cause. Then, we could help you to optimize them. What do you think?
Jonathan
The high CPU appears after Zeebe has been running for a period of time, and I suspect some data is causing it. Normal operation can only be restored by deleting the files in the data directory. What I want to ask is: how can I find out what is causing the problem, how can I avoid it, and if it occurs, how can I restore normal operation without deleting the data directory?
Hello @bugbugmaker ,
this would be easier to diagnose with the BPMN processes that are running; usually this does not happen.
Also, it would be helpful to know how much disk space you gave Zeebe.
Jonathan
There is still a lot of disk space left