Kill process instance when running in an infinite loop without wait state

Hello,

Is there a way to kill a process instance (and all threads it may spawn) programmatically via the Java API when the BPMN contains an infinite loop without a wait state, as in the attached examples?

The first one (infiniteLoopInsideTask.bpmn) will run forever because the Groovy script contains an endless loop.
The second one (infiniteLoopOutsideTask.bpmn) will run forever because the modelled sequence flows form a loop.

I know that there are ways to terminate a process instance (via RuntimeService.deleteProcessInstance() or by attaching a terminate end event), but they only have an effect if the execution is in some form of wait state (and not currently processing anything). I tried both solutions with the same outcome: the threads keep on running, as there is no checkpoint in between where the job executor could switch to another job. It’s like an ordinary thread running in an endless loop without any exit criterion (and which is also not listening for interruptions). So the only way I see to terminate it would be to terminate the entire operating system process in which the BPMN process is running.

It basically is a similar issue to the ticket described here. However, we have the following constraints:

We cannot modify the executed BPMN by adding wait states (as the BPMN files get uploaded and are not under our control). So what we are basically trying to achieve is to execute a BPMN in a kind of sandbox. If the entire process takes too long, it shall be terminated (without any thread leaks).

Is there a way to achieve this?

infiniteLoopInsideTask.bpmn (3.5 KB)

Here is the second example (as I am only allowed to attach 1 file per post): infiniteLoopOutsideTask.bpmn (4.8 KB)

There is no such feature. Is there actually any way of doing this in a plain (non-Camunda) Java program?

Hi @gugumonster,

as @thorben mentioned, I don’t think there is a safe way to stop a thread in Java in general, please refer to
https://docs.oracle.com/javase/8/docs/api/java/lang/Thread.html#stop--

I assume you should build a timeout into your script; that would be the cleanest way. Is that an option for you?

Cheers,
Askar

Hey,

thank you for your quick responses. I really appreciate it.
Yes, there is no way of terminating a thread in a safe manner when it is not explicitly listening for interrupts.

Since I am not in control of the BPMN (it gets uploaded by a third party), I would need to parse the file and hook in timeouts. As this is not a fail-safe solution either, the only way I see is to run it in a separate JVM.
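The separate-JVM idea can be sketched with plain JDK classes. This is only an illustration under assumptions: SandboxedRun and runWithTimeout are hypothetical names, and the command that starts the engine in the child JVM is left to the caller. The point it shows is that killing the operating system process ends every thread inside it, whether or not those threads react to interrupts.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Camunda: runs a command in a child
// process and kills that process if it does not finish in time.
class SandboxedRun {

  /** Returns true if the child exited on its own, false if it had to be killed. */
  static boolean runWithTimeout(List<String> command, long timeout, TimeUnit unit)
      throws IOException, InterruptedException {
    Process child = new ProcessBuilder(command)
        .inheritIO() // forward the child's output to this process
        .start();

    if (child.waitFor(timeout, unit)) {
      return true; // finished within the time limit
    }

    // Timed out: killing the OS process also ends all threads inside it.
    child.destroyForcibly().waitFor();
    return false;
  }
}
```

The caller would pass something like the path to the java binary plus a main class that deploys and starts the uploaded BPMN; that main class is not shown here and would have to be written separately.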

I was asking because I don’t know how Camunda is translating the BPMN to code. So for infiniteLoopOutsideTask.bpmn I was hoping that maybe after every sequence (or task) the generated code is checking whether the current thread got interrupted and if so, stop execution. I guess this feature would actually not be so hard to implement (and could be configurable per process definition whether it shall be activated).

I extended infiniteLoopOutsideTask.bpmn (see attached version 2), which contains this check inside the loop: The CheckIfInterruptedServiceTask consists of this code:

if (Thread.currentThread().isInterrupted()) {
   throw new InterruptedException("got interrupted");
}

So when the process instance execution takes longer than a certain amount of time, I interrupt the thread where the execution takes place. This gets recognized by the process, which then exits (e.g. by throwing an exception).
Could you imagine adding this feature in Camunda?
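The interrupt-based approach described above can be tried out in plain Java, without Camunda. In this minimal sketch, InterruptDemo, loopUntilInterrupted and runWithWatchdog are illustrative names; the loop stands in for the modelled loop with the CheckIfInterruptedServiceTask, and the watchdog interrupts the worker thread once the timeout has passed.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Plain-Java illustration; no Camunda classes are involved.
class InterruptDemo {

  /** Stand-in for the modelled loop: checks the interrupt flag once per iteration. */
  static void loopUntilInterrupted() throws InterruptedException {
    while (true) {
      if (Thread.currentThread().isInterrupted()) {
        throw new InterruptedException("got interrupted");
      }
      // ... one pass through the loop body would run here ...
    }
  }

  /** Returns true if the worker thread stopped after being interrupted. */
  static boolean runWithWatchdog(long timeoutMillis) throws InterruptedException {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<?> future = executor.submit(() -> {
      loopUntilInterrupted();
      return null;
    });
    try {
      future.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return false; // the loop ended on its own
    } catch (TimeoutException e) {
      future.cancel(true); // sets the interrupt flag on the worker thread
      executor.shutdown();
      // True only if the worker thread has really terminated.
      return executor.awaitTermination(5, TimeUnit.SECONDS);
    } catch (ExecutionException e) {
      return false; // the loop failed for another reason
    } finally {
      executor.shutdownNow();
    }
  }
}
```

If the check inside the loop is removed, cancel(true) still sets the flag but nothing reads it, so the worker keeps running. That is the situation with a script task that loops internally.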

Of course, this solution would have no effect for infiniteLoopInsideTask.bpmn; I guess that case cannot be solved at all. But since uploading BPMNs with script tasks is not actually necessary in our use case, we could simply parse the file first and reject it if script tasks are used. So I basically only have a problem with infiniteLoopOutsideTask.bpmn.

infiniteLoopOutsideTaskV2.bpmn (5.7 KB)

Hi,

Just to provide some basic understanding: Camunda does not compile a BPMN model or generate code. It interprets the model.

Regarding your actual question, it should be fairly easy to build such a timeout facility yourself and plug it into the engine:

Write a command interceptor that keeps track of the time:

package org.camunda.bpm.unittest;

import org.camunda.bpm.engine.impl.interceptor.Command;
import org.camunda.bpm.engine.impl.interceptor.CommandInterceptor;

public class TimeoutInterceptor extends CommandInterceptor {

  protected static final long TIMEOUT_MILLIS = 5 * 60 * 1000;
  // Start time of the outermost command running on the current thread
  protected static final ThreadLocal<Long> commandBeginTime = new ThreadLocal<Long>();

  public <T> T execute(Command<T> cmd) {

    // Commands can be nested; only the outermost one starts and clears the clock
    boolean recordTime = commandBeginTime.get() == null;
    if (recordTime) {
      commandBeginTime.set(System.currentTimeMillis());
    }

    try {
      return next.execute(cmd);
    } finally {
      if (recordTime) {
        commandBeginTime.remove();
      }
    }
  }

  public static void ensureThreadNotTimedOut() {
    Long startTime = commandBeginTime.get();
    if (startTime == null) {
      // No command is running on this thread, so there is nothing to check
      return;
    }

    if (System.currentTimeMillis() - startTime > TIMEOUT_MILLIS) {
      throw new RuntimeException("timeout");
    }
  }
}

Register the interceptor with the engine:

package org.camunda.bpm.unittest;

import java.util.Arrays;

import org.camunda.bpm.engine.ProcessEngine;
import org.camunda.bpm.engine.impl.cfg.ProcessEngineConfigurationImpl;
import org.camunda.bpm.engine.impl.cfg.ProcessEnginePlugin;
import org.camunda.bpm.engine.impl.interceptor.CommandInterceptor;

public class TimeoutPlugin implements ProcessEnginePlugin {

  public void postInit(ProcessEngineConfigurationImpl config) {
  }

  public void postProcessEngineBuild(ProcessEngine processEngine) {
  }

  public void preInit(ProcessEngineConfigurationImpl config) {
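    // Each command executor builds its own interceptor chain and sets the "next" link
    // on every interceptor, so use one instance per chain. The start time lives in a
    // static ThreadLocal and is therefore shared between both instances.
    config.setCustomPreCommandInterceptorsTxRequired(Arrays.<CommandInterceptor>asList(new TimeoutInterceptor()));
    config.setCustomPreCommandInterceptorsTxRequiresNew(Arrays.<CommandInterceptor>asList(new TimeoutInterceptor()));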
  }
}

Then use TimeoutInterceptor#ensureThreadNotTimedOut from any code that is called by the process engine as you like.
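To see how the two pieces work together without starting an engine, the pattern can be reduced to plain Java. This is a simplified sketch, not Camunda code: CommandTimer, execute and ensureNotTimedOut are illustrative names, and a Supplier stands in for the engine command. In a real process, the check would be called from a delegate or listener that the engine invokes inside the loop.

```java
import java.util.function.Supplier;

// Framework-free stand-in for the interceptor pattern; names are illustrative.
class CommandTimer {

  // Start time of the outermost call on the current thread
  private static final ThreadLocal<Long> BEGIN = new ThreadLocal<>();

  /** Runs work and records the start time unless an outer call already did. */
  static <T> T execute(Supplier<T> work) {
    boolean outermost = BEGIN.get() == null;
    if (outermost) {
      BEGIN.set(System.currentTimeMillis());
    }
    try {
      return work.get();
    } finally {
      if (outermost) {
        BEGIN.remove(); // leave no stale value behind on pooled threads
      }
    }
  }

  /** Throws if the outermost call on this thread started more than timeoutMillis ago. */
  static void ensureNotTimedOut(long timeoutMillis) {
    Long begin = BEGIN.get();
    if (begin != null && System.currentTimeMillis() - begin > timeoutMillis) {
      throw new IllegalStateException("timeout");
    }
  }
}
```

A loop that calls ensureNotTimedOut once per iteration inside execute ends with the exception as soon as the limit is exceeded. A loop that never calls it is not affected, which matches the limitation for script tasks discussed earlier in the thread.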

Cheers,
Thorben

Hey,

thanks for clarifying how the engine works.
Your solution worked out. Thank you!

We also allow uploading DMN files, and there I have no code from which I could call the ensureThreadNotTimedOut() method. So I simply adapted your TimeoutInterceptor to call ensureThreadNotTimedOut() in the finally clause. This works out fine.
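The adapted interceptor itself is not shown in the thread, so the following is only a plain-Java sketch of what a check in the finally clause does; PostCheckTimer and executeWithPostCheck are hypothetical names. Note that such a check runs only after the wrapped call has returned: it reports a call that took too long, but it cannot stop one that never returns.

```java
import java.util.function.Supplier;

// Framework-free sketch of the "check in finally" variant; names are illustrative.
class PostCheckTimer {

  /** Runs work, then throws if it took longer than timeoutMillis. */
  static <T> T executeWithPostCheck(Supplier<T> work, long timeoutMillis) {
    long begin = System.nanoTime();
    boolean completed = false;
    try {
      T result = work.get();
      completed = true;
      return result;
    } finally {
      long elapsedMillis = (System.nanoTime() - begin) / 1_000_000;
      // Only raise the timeout when work finished normally, so that an
      // exception thrown by work itself is not replaced by this one.
      if (completed && elapsedMillis > timeoutMillis) {
        throw new IllegalStateException("timeout after " + elapsedMillis + " ms");
      }
    }
  }
}
```

The completed flag is a design choice of this sketch: an exception thrown from a finally block replaces any exception that is already propagating, which would hide the original failure.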

Thank you very much for your help!

I am having a similar problem.
Sometimes a BPMN flow we designed has an endless loop in it, where the execution just moves from one gateway to another forever. That’s obviously due to bad BPMN design. Such cases have sometimes brought the whole Camunda installation down, because they fill up the Postgres database to 100% very quickly.
We want to be able to detect such stuck process instances before they bring down our Camunda. As I understand it, I can’t use this interceptor because it will only work if it is executed inside a task. However, as mentioned above, we sometimes get endless loops that just move from one gateway to another in a circle.