Is there a way to kill a process instance (and all threads it may spawn) programmatically via the Java API when the BPMN contains infinite loops without a wait state, as in the attached examples?
The first one (infiniteLoopInsideTask.bpmn) will run forever as the groovy script contains an endless loop.
The second one (infiniteLoopOutsideTask.bpmn) will run forever as it contains a loop via the modelled sequences.
I know that there are ways to terminate a process instance (via RuntimeService.deleteProcessInstance() or by attaching a terminate end event), but they only have an effect when the execution is in some form of wait state (and not currently processing anything). I tried both solutions with the same outcome: the threads keep on running, as there is no checkpoint in between where the job executor could switch to another job. It’s like an ordinary thread running in an endless loop without any exit criterion (and which is also not listening for interruptions). So the only way I see to terminate it would be to terminate the entire operating system process in which the BPMN process is running.
It basically is a similar issue to the ticket described here. However, we have the following constraints:
We cannot modify the executed BPMN by adding wait states (as the BPMN files get uploaded and are not under our control). So what we are basically trying to achieve is to execute a BPMN in a kind of sandbox. If the entire process takes too long, the process shall get terminated (without any thread leaks).
Thank you for your quick responses. I really appreciate it.
Yes, there is no way of terminating a thread in a safe manner when it is not explicitly listening for interrupts.
Since I am not in control of the BPMN (it gets uploaded by a third party), I would need to parse the file and hook in timeouts. As this is not a fail-safe solution either, the only way I see is to run it in a separate JVM.
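To make the separate-JVM idea concrete, here is a minimal sketch using only the JDK's ProcessBuilder. The class name ChildJvmRunner and the method runWithTimeout are made up for illustration, and the command that actually starts the engine in the child JVM is left to the caller, since nothing in this thread defines it.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Camunda: runs a command in a child process
// and kills that process if it does not finish within the timeout.
final class ChildJvmRunner {

    /** Returns true if the child exited in time, false if it had to be killed. */
    static boolean runWithTimeout(List<String> command, long timeout, TimeUnit unit)
            throws IOException, InterruptedException {
        // inheritIO() avoids the child blocking on a full stdout/stderr pipe
        Process process = new ProcessBuilder(command).inheritIO().start();
        if (process.waitFor(timeout, unit)) {
            return true;
        }
        // Killing the OS process also ends every thread inside it,
        // whether or not those threads react to interrupts.
        process.destroyForcibly();
        process.waitFor();
        return false;
    }
}
```

The caller would pass something like the path to the java binary, a classpath and a main class that deploys and starts the uploaded BPMN. The price of this approach is one JVM start-up per execution.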
I was asking because I don’t know how Camunda is translating the BPMN to code. So for infiniteLoopOutsideTask.bpmn I was hoping that maybe after every sequence (or task) the generated code is checking whether the current thread got interrupted and if so, stop execution. I guess this feature would actually not be so hard to implement (and could be configurable per process definition whether it shall be activated).
I extended infiniteLoopOutsideTask.bpmn (see attached version 2) so that it contains this check inside the loop. The CheckIfInterruptedServiceTask consists of this code:
if (Thread.currentThread().isInterrupted()) {
throw new InterruptedException("got interrupted");
}
So when the process instance execution takes longer than a certain amount of time, I interrupt the thread in which the execution takes place. This gets recognized by the process, which then exits (e.g. by throwing an exception).
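The watchdog side of this can be sketched with plain java.util.concurrent. This is only an illustration of the idea under my own assumptions: the class InterruptOnTimeout and its methods are made up, and loopWithInterruptCheck() is a stand-in for the modelled loop that passes through the check task on every iteration. It is not how the engine itself schedules work.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: run work on its own thread and interrupt it on timeout.
final class InterruptOnTimeout {

    /** Stand-in for the modelled loop; it only ends when the thread is interrupted. */
    static void loopWithInterruptCheck() throws InterruptedException {
        while (true) {
            // Same check as in CheckIfInterruptedServiceTask
            if (Thread.currentThread().isInterrupted()) {
                throw new InterruptedException("got interrupted");
            }
        }
    }

    /** Returns true if the work finished in time, false if it was interrupted. */
    static boolean runWithTimeout(Callable<Void> work, long timeoutMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<Void> future = executor.submit(work);
        boolean completed;
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            completed = true;
        } catch (TimeoutException e) {
            future.cancel(true); // sets the interrupt flag on the worker thread
            completed = false;
        } finally {
            executor.shutdownNow();
        }
        // Fail loudly if the worker ignored the interrupt, i.e. the thread leaked
        if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("worker thread did not stop after interrupt");
        }
        return completed;
    }
}
```

Calling runWithTimeout(() -> { InterruptOnTimeout.loopWithInterruptCheck(); return null; }, 200) returns false once the loop has been interrupted. A loop that never checks the flag would end in the IllegalStateException instead, which is exactly the infiniteLoopInsideTask.bpmn situation.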
Could you imagine adding this feature in Camunda?
This solution would of course have no effect for infiniteLoopInsideTask.bpmn, and I guess that case cannot be solved at all. But since uploading BPMNs with Script tasks is actually not necessary in our use case, we could simply parse the file first and reject it if Script tasks are used. So I basically only have a problem with infiniteLoopOutsideTask.bpmn.
Thanks for clarifying how the engine works.
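The parse-and-reject step could look roughly like the following, using only the JDK's DOM parser. The class ScriptTaskCheck and the method containsScriptTask are made-up names. The sketch looks only for scriptTask elements in the BPMN 2.0 model namespace; as far as I know, scripts can also be attached in other places (listeners, for example), and those would need additional checks.

```java
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical pre-upload check: does the BPMN XML contain any script task?
final class ScriptTaskCheck {

    // Namespace of the BPMN 2.0 model elements (process, scriptTask, ...)
    static final String BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL";

    static boolean containsScriptTask(InputStream bpmnXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // The file comes from a third party, so refuse DOCTYPE declarations
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document document = factory.newDocumentBuilder().parse(bpmnXml);
        return document.getElementsByTagNameNS(BPMN_NS, "scriptTask").getLength() > 0;
    }
}
```

An upload handler would call containsScriptTask() on the uploaded stream and reject the file when it returns true, before anything gets deployed.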
Your solution worked out. Thank you!
We also allow DMN files to be uploaded, and there I have no code of my own in which I could call the ensureThreadNotTimedOut() method. So I simply adapted the TimeoutInterceptor you provided to call ensureThreadNotTimedOut() in the finally clause. This works fine.
I am having a similar problem.
Sometimes our BPMN flows contain endless loops, where the execution just moves from one gateway to another in a circle. That’s obviously due to bad BPMN design. Such cases have sometimes brought the whole Camunda installation down, because they fill up the Postgres database to 100% very quickly.
We want to be able to detect such stuck process instances before they bring down our Camunda. As I understand it, I can’t use this interceptor because it will only work if it’s executed inside a task. However, as mentioned above, we sometimes get endless loops that just move from one gateway to another in a circle.