Process Stuck in Start

Hi guys, I want to execute all my processes asynchronously.
To do that, I configured the Start Event as an async activity (see screenshot).
The thing is that most of the instances (not all) get stuck in the Start Event, as you can see in the second screenshot.

What am I doing wrong?

Thanks in advance,

If I run the jobs manually, most of them complete.
Even so, some processes are still stuck.
For the manual execution I’m using this bash script:

    jobs_list_json=$(curl -s "${CAMUNDA_API_PATH}job")

    # Execute every pending job by hand via the REST API
    for job_id in $(echo "${jobs_list_json}" | jq -r '.[].id'); do
      job_definition_json=$(curl -s "${CAMUNDA_API_PATH}job/${job_id}")
      echo "Executing ${job_id}"
      echo "For process ${job_definition_json}"
      echo "-->"
      curl --header 'Accept: application/json' --request POST "${CAMUNDA_API_PATH}job/${job_id}/execute"
    done

But running them manually is not an option, given the purpose of the processes.


What is your setup?
What Version of Camunda?
Upload your config.

Hi @Niall
I’m using a shared Camunda (v7.12) with the standard config. This shared Camunda is running in GCP with autoscaling, but right now there is only one instance running.
I just changed the DB to a MySQL server.
I attached bpm-platform.xml and server.xml.

bpm-platform.xml (3.5 KB) server.modified.xml (7.8 KB)

How are you deploying the process?
How are you implementing service tasks?
Are there any errors appearing in the log after you start the process?

I’m deploying the process using the REST API.
Regarding the service tasks, I don’t use any special config; all my service tasks are synchronous.
And unfortunately, I can’t find any errors.
The process is just stuck at the start without any error.
Today I ran a test with more than 10K process starts and it worked fine.
My main concern is whether the config is OK and in which cases a process can get stuck at the start event.
Could it be an error during the flow? Since only the start event is async, would the process wait at that task?
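For reference, the deployment step can be sketched like this (a minimal sketch assuming the standard Camunda 7 `POST /deployment/create` endpoint; the deployment name and file name are placeholders):

```shell
# Sketch of deploying a BPMN model via the Camunda 7 REST API
# (POST /deployment/create). CAMUNDA_API_PATH, "my-process" and
# "process.bpmn" are placeholders for your own values.
CAMUNDA_API_PATH="${CAMUNDA_API_PATH:-http://localhost:8080/engine-rest/}"

deploy_bpmn() {
  # The model file is sent as multipart form data;
  # enable-duplicate-filtering skips a new deployment if nothing changed.
  curl -s -X POST "${CAMUNDA_API_PATH}deployment/create" \
    -F "deployment-name=${1}" \
    -F "enable-duplicate-filtering=true" \
    -F "data=@${2}"
}

# Usage (requires a running engine):
# deploy_bpmn my-process process.bpmn
```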

The problem is almost certainly the Job Executor not picking up the job, if there are indeed no errors in the log.
There could be a lot of possible causes, and I’d suggest taking a look at the docs to find out what might be causing it.

Thanks for the recommendation.
I’ll take a look and if I find something I’ll let you know

Hi @dscheinin,

I could reproduce your problem: the async continuation is not picked up after a server restart.

What I did:

  1. Modeled a process with Asynchronous After on the start event.
  2. Deployed it with the REST API to the prepackaged shared process engine running on Tomcat.
  3. Started a process instance, which worked fine.
  4. Restarted the Tomcat server.
  5. Started another process instance. This one got stuck in the job.
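Steps 3–5 above can be sketched against the REST API like this (the engine URL and the process key `myProcess` are assumptions):

```shell
# Sketch of the repro steps above; ENGINE and "myProcess" are assumptions.
ENGINE="${ENGINE:-http://localhost:8080/engine-rest}"

start_instance() {
  # POST /process-definition/key/{key}/start with no variables
  curl -s -X POST "${ENGINE}/process-definition/key/${1}/start" \
    -H 'Content-Type: application/json' -d '{}'
}

stuck_jobs() {
  # After the restart, the waiting async-continuation job shows up here
  curl -s "${ENGINE}/job" | jq -r '.[].id'
}

# start_instance myProcess   # step 3: completes before the restart
# (restart Tomcat)           # step 4
# start_instance myProcess   # step 5: the job is created but never acquired
# stuck_jobs                 # lists the id of the stuck job
```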

Then I diagnosed the problem as described by Thorben here:

The job is created in the database with a deployment id of d7eeb0ec-7a5e-11ea-ac1d-3ce1a1c19785.

After adding debug output for the job acquisition, I found this snippet in the log:

09-Apr-2020 14:56:46.207 FEIN [Thread-5] org.camunda.commons.logging.BaseLogger.logDebug ENGINE-13005 Starting command -------------------- AcquireJobsCmd ----------------------
09-Apr-2020 14:56:46.207 FEIN [Thread-5] org.camunda.commons.logging.BaseLogger.logDebug ENGINE-13009 opening new command context
09-Apr-2020 14:56:46.209 FEIN [Thread-5] org.apache.ibatis.logging.jdbc.BaseJdbcLogger.debug ==>  Preparing: select RES.ID_, RES.REV_, RES.DUEDATE_, RES.PROCESS_INSTANCE_ID_, RES.EXCLUSIVE_ from ACT_RU_JOB RES where (RES.RETRIES_ > 0) and ( RES.DUEDATE_ is null or RES.DUEDATE_ <= ? ) and (RES.LOCK_OWNER_ is null or RES.LOCK_EXP_TIME_ < ?) and RES.SUSPENSION_STATE_ = 1 and (RES.DEPLOYMENT_ID_ is null or ( RES.DEPLOYMENT_ID_ IN ( ? , ? ) ) ) and ( ( RES.EXCLUSIVE_ = 1 and not exists( select J2.ID_ from ACT_RU_JOB J2 where J2.PROCESS_INSTANCE_ID_ = RES.PROCESS_INSTANCE_ID_ -- from the same proc. inst. and (J2.EXCLUSIVE_ = 1) -- also exclusive and (J2.LOCK_OWNER_ is not null and J2.LOCK_EXP_TIME_ >= ?) -- in progress ) ) or RES.EXCLUSIVE_ = 0 ) LIMIT ? OFFSET ? 
09-Apr-2020 14:56:46.210 FEIN [Thread-5] org.apache.ibatis.logging.jdbc.BaseJdbcLogger.debug ==> Parameters: 2020-04-09 14:56:46.207(Timestamp), 2020-04-09 14:56:46.207(Timestamp), b2ae552f-6ad1-11ea-8375-3ce1a1c19785(String), b2e62e19-6ad1-11ea-8375-3ce1a1c19785(String), 2020-04-09 14:56:46.207(Timestamp), 3(Integer), 0(Integer)
09-Apr-2020 14:56:46.211 FEIN [Thread-5] org.apache.ibatis.logging.jdbc.BaseJdbcLogger.debug <==      Total: 0
09-Apr-2020 14:56:46.211 FEIN [Thread-5] org.camunda.commons.logging.BaseLogger.logDebug ENGINE-13011 closing existing command context
09-Apr-2020 14:56:46.212 FEIN [Thread-5] org.camunda.commons.logging.BaseLogger.logDebug ENGINE-13006 Finishing command -------------------- AcquireJobsCmd ----------------------

The crucial part of the query is and (RES.DEPLOYMENT_ID_ is null or ( RES.DEPLOYMENT_ID_ IN ( ? , ? ) ) ) with the parameters b2ae552f-6ad1-11ea-8375-3ce1a1c19785 and b2e62e19-6ad1-11ea-8375-3ce1a1c19785 which didn’t match the original one from above.

To overcome this issue, you could either use a different deployment model and deploy new process models by redeploying a process application as a WAR file:

Or change the setting in the bpm-platform.xml for the job executor:

<property name="jobExecutorDeploymentAware">false</property>

But be aware that this has other implications for heterogeneous cluster setups and process applications.
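For orientation, in the prepackaged distribution this property sits among the engine properties in bpm-platform.xml, roughly like this (other properties elided; a sketch, not your exact file):

```xml
<process-engine name="default">
  <!-- ... datasource, plugins, other properties ... -->
  <properties>
    <!-- acquire jobs from all deployments, not only registered ones -->
    <property name="jobExecutorDeploymentAware">false</property>
  </properties>
</process-engine>
```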

Hope this helps, Ingo


That’s very useful.
I will check the infrastructure to see if I can change the bpm-platform.xml.

Thank you very much!

Hi @Ingo_Richtsmeier

I’m stuck with the same problem. How can I resolve it if I have a heterogeneous cluster setup with 2 Spring Boot Camunda apps?
I’ve just resolved another problem by adding “deployment.aware=true”… But now I have a new problem.
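(In Spring Boot starter form, that setting looks roughly like this in application.yaml; the exact property name `camunda.bpm.job-execution.deployment-aware` is an assumption from the starter’s configuration properties:)

```yaml
# application.yaml: the job executor only acquires jobs from
# deployments this application has registered itself
camunda:
  bpm:
    job-execution:
      deployment-aware: true
```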

Thank you in advance.

Hi @ironWil1,

if you use jobExecutorDeploymentAware=true, don’t do a deployment from the modeler.

Always save your process model to the project and restart the spring boot application to deploy the diagram.

Hope this helps, Ingo


@Ingo_Richtsmeier, thank you for the answer.

I do; I always deploy the .bpmn file together with the Spring Boot app. But I can only avoid the problem if I change something in my scheme.bpmn file, so that Cockpit then shows a new ‘Definition version’.
If I only change something in the application code and deploy it, the ‘Definition Version’ doesn’t change, and then I have the problem again.
How can I change the Definition Version with every app deployment without changing the scheme.bpmn file? (For now we change, for example, some event ID before redeploying, which is pretty annoying.) Or maybe there is something else to change.

Thank you in advance!

Hi @ironWil1,

what kind of problem?

Usually the code is independent of the process model. Could you give an example of your change?

@Ingo_Richtsmeier oh sorry, I didn’t link to your answer above.

The problem is that my process won’t start after redeploying the Spring Boot app with no changes in the .bpmn file. I just make a new app deployment to the cluster with changes only in the code (a new Docker image with my app and the .bpmn file is deployed to the k8s cluster).