History cleanup job run duration exceeds the lock-time-in-millis

Mohammad_Mahdavi · August 5, 2019, 4:09pm

I configured Camunda history cleanup with the properties below on two Camunda instances. I have some process instances with an async task that has been retried about 30k times. Each retry logs an exception in the act_ge_bytearray table, which now has 131M rows. So when the cleanup job tries to remove such an instance, it executes a heavy query that runs for more than 10 minutes. When the duration of the current job exceeds lock-time-in-millis, a second job starts on the second Camunda instance and tries to do the same work. Because the first query is still running, the second query is blocked until the first one finishes. In this situation the job executor on both instances stops responding.
I increased the lock-time-in-millis value to 80 minutes. This setting increases the risk for job execution when an instance restarts (because an unfinished job that still holds its lock must wait about 40 minutes before it can be executed on a new instance). Is there any solution for managing this situation?

My Camunda 7.10 config:

 "camunda.bpm.generic-properties.properties.historyCleanupBatchSize": "10"
 "camunda.bpm.generic-properties.properties.historyCleanupBatchWindowEndTime": "07:00"
 "camunda.bpm.generic-properties.properties.historyCleanupBatchWindowStartTime": "01:00"
 "camunda.bpm.generic-properties.properties.historyCleanupStrategy": "endTimeBased"
 "camunda.bpm.job-execution.lock-time-in-millis": "4800000"
 "camunda.bpm.job-executor-acquire-by-priority": "true"

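As a quick sanity check on the numbers above, here is a minimal sketch that converts the configured lock-time-in-millis to minutes and compares it with a job's run time. The class and method names are hypothetical and not part of Camunda; only the value 4800000 and the "more than 10 minutes" query duration come from this post.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not part of the Camunda API.
class LockTimeCheck {

    // Converts a lock-time-in-millis value to whole minutes.
    static long lockTimeMinutes(long lockTimeInMillis) {
        return Duration.ofMillis(lockTimeInMillis).toMinutes();
    }

    // Returns true when the job runs longer than the lock lasts,
    // which is the situation where another instance may pick the job up again.
    static boolean outlivesLock(Duration jobRunTime, long lockTimeInMillis) {
        return jobRunTime.compareTo(Duration.ofMillis(lockTimeInMillis)) > 0;
    }

    public static void main(String[] args) {
        long configured = 4_800_000L; // value from the config above
        System.out.println(lockTimeMinutes(configured) + " minutes"); // prints "80 minutes"
        // A 10-minute cleanup query fits inside the 80-minute lock.
        System.out.println(outlivesLock(Duration.ofMinutes(10), configured)); // prints "false"
    }
}
```

With a lock shorter than the query run time, `outlivesLock` returns true, which matches the blocking behaviour described above.
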
fml2 · August 6, 2019, 4:40am

I’m not a great expert in configuring Camunda, but 30000 retries and a 40-minute job lock time are far beyond the values I’d consider “normal” for a server-side application (which Camunda is in most cases). So I’d try to reduce those values significantly.

Mohammad_Mahdavi · August 6, 2019, 5:58am

I had mistakenly set the retry limit to infinite for some tasks. So, to let the cleanup job handle the remaining process instances, I set lock-time-in-millis to 80 minutes. Thanks.