Hi all,
the problem is finally solved. In our Docker containers the job executor was deployment aware, which is why the jobs were not selected by the JobExecutor. Somehow they ran right after the BPMN deployment, but after some time they no longer did and stayed stuck in waiting.
This thread gave me the idea to try it: Timers doesnt work after server restart - #13 by Webcyberrob
Here is the config change we made in the bpm-platform.xml file in the container:
<process-engine name="default">
  <properties>
    <property name="jobExecutorDeploymentAware">false</property>
    <property name="jobExecutorActivate">true</property>
  </properties>
I cannot explain how the selection works internally with the deployment ID, but here is a comparison of the SQL statements from my case.
-- Before the config change:
Preparing: select RES.ID_, RES.REV_, RES.DUEDATE_, RES.PROCESS_INSTANCE_ID_, RES.EXCLUSIVE_
from ACT_RU_JOB RES
where (RES.RETRIES_ > 0)
  and (RES.DUEDATE_ is null or RES.DUEDATE_ <= ?)
  and (RES.LOCK_OWNER_ is null or RES.LOCK_EXP_TIME_ < ?)
  and RES.SUSPENSION_STATE_ = 1
  and (RES.DEPLOYMENT_ID_ is null)
  and (
    (
      RES.EXCLUSIVE_ = true
      and not exists (
        select J2.ID_ from ACT_RU_JOB J2
        where J2.PROCESS_INSTANCE_ID_ = RES.PROCESS_INSTANCE_ID_ -- from the same proc. inst.
          and (J2.EXCLUSIVE_ = true) -- also exclusive
          and (J2.LOCK_OWNER_ is not null and J2.LOCK_EXP_TIME_ >= ?) -- in progress
      )
    )
    or RES.EXCLUSIVE_ = false
  )
LIMIT ? OFFSET ?
-- After the config change:
Preparing: select RES.ID_, RES.REV_, RES.DUEDATE_, RES.PROCESS_INSTANCE_ID_, RES.EXCLUSIVE_
from ACT_RU_JOB RES
where (RES.RETRIES_ > 0)
  and (RES.DUEDATE_ is null or RES.DUEDATE_ <= ?)
  and (RES.LOCK_OWNER_ is null or RES.LOCK_EXP_TIME_ < ?)
  and RES.SUSPENSION_STATE_ = 1
  and (
    (
      RES.EXCLUSIVE_ = true
      and not exists (
        select J2.ID_ from ACT_RU_JOB J2
        where J2.PROCESS_INSTANCE_ID_ = RES.PROCESS_INSTANCE_ID_ -- from the same proc. inst.
          and (J2.EXCLUSIVE_ = true) -- also exclusive
          and (J2.LOCK_OWNER_ is not null and J2.LOCK_EXP_TIME_ >= ?) -- in progress
      )
    )
    or RES.EXCLUSIVE_ = false
  )
LIMIT ? OFFSET ?
The only difference is the condition (RES.DEPLOYMENT_ID_ is null), which appears in the SQL of the deployment-aware job executor. In my case that condition excluded every job that had a deployment ID. With jobExecutorDeploymentAware set to false, all due jobs are selected and acquired. Jobs are now also executed again after a restart of Camunda, which was not the case before the config change.
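To make the effect of that one condition concrete, here is a small sketch in Python using an in-memory SQLite table. It is only a simplified model: the table is a cut-down stand-in for ACT_RU_JOB with four of its columns, the job and deployment IDs are made up, and the query keeps only the parts needed to show the DEPLOYMENT_ID_ filter. It is not the engine's real acquisition logic.

```python
import sqlite3

# Cut-down stand-in for Camunda's ACT_RU_JOB table (assumption: only the
# columns needed to illustrate the DEPLOYMENT_ID_ filter are modelled).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table ACT_RU_JOB ("
    "ID_ text primary key, RETRIES_ integer, "
    "SUSPENSION_STATE_ integer, DEPLOYMENT_ID_ text)"
)
conn.executemany(
    "insert into ACT_RU_JOB values (?, ?, ?, ?)",
    [
        ("job-1", 3, 1, "deployment-A"),  # hypothetical job tied to a deployment
        ("job-2", 3, 1, "deployment-A"),  # hypothetical job tied to a deployment
        ("job-3", 3, 1, None),            # hypothetical job without a deployment id
    ],
)

# Simplified common part of the acquisition query from the logs above.
BASE = "select ID_ from ACT_RU_JOB where RETRIES_ > 0 and SUSPENSION_STATE_ = 1"


def acquirable(deployment_aware):
    """Return the job ids matched by the simplified acquisition query."""
    sql = BASE
    if deployment_aware:
        # The extra predicate seen in the "before" log.
        sql += " and (DEPLOYMENT_ID_ is null)"
    sql += " order by ID_"
    return [row[0] for row in conn.execute(sql)]


before = acquirable(deployment_aware=True)
after = acquirable(deployment_aware=False)
print("deployment aware:    ", before)
print("not deployment aware:", after)
```

With the filter in place only the job without a deployment ID is matched. Without it, all three jobs are candidates for acquisition, which matches what I saw after the config change.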