How to correctly end a subprocess containing incidents?

andrius · August 23, 2023, 9:40am

I have a subprocess that itself contains a multi-instance subprocess, which sometimes generates multiple incidents.
I need a way to (manually) push this subprocess forward despite those incidents so that the remaining part of the process runs, because I need the cleanup steps to be performed.
My idea is to add an interrupting subprocess or a boundary event that correlates to a “Cancel” message and ends the subprocess.
My questions:

Will the incidents vanish automatically when the interrupting event fires, or does the event-handling code have to resolve them through the API?
Does the API offer a good way to determine whether the process is stuck with incidents and will not move forward? Currently I count incidents and jobs with retries left, but this looks more like a hack to me.
This is a part of the subprocess I’d like to be able to “Cancel”:

[image: diagram of the subprocess]

Any suggestions?
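
For reference, the "count incidents and jobs with retries left" check described above can be written down as plain decision logic. This is only a sketch under assumptions: in Camunda 7 the two counts would presumably come from `runtimeService.createIncidentQuery().processInstanceId(id).count()` and `managementService.createJobQuery().processInstanceId(id).withRetriesLeft().count()`, the class and method names below are hypothetical, and the rule itself is a heuristic rather than an official definition of "stuck".

```java
// Hypothetical helper, not part of the Camunda API.
// It only encodes the heuristic from the post: incidents exist
// and no job is left that the job executor could still retry.
final class StuckCheck {
    private StuckCheck() {}

    /**
     * @param openIncidents       number of open incidents on the process instance
     * @param jobsWithRetriesLeft number of jobs that still have retries left
     * @return true if the instance looks stuck according to the heuristic
     */
    static boolean looksStuck(long openIncidents, long jobsWithRetriesLeft) {
        if (openIncidents < 0 || jobsWithRetriesLeft < 0) {
            throw new IllegalArgumentException("counts must not be negative");
        }
        return openIncidents > 0 && jobsWithRetriesLeft == 0;
    }
}
```

An instance with incidents but with jobs that can still be retried is treated as "still working", which matches the intent of waiting before sending a manual “Cancel”.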

Niall · August 23, 2023, 11:44am

If you add a boundary event to the multi-instance subprocess, it’ll actually cancel all running instances, even ones that aren’t in error… so it might be a little risky.

I would suggest using a signal event subprocess inside the multi-instance subprocess (that’s not a fun sentence to say). It would look like this:

The reason to use a signal is that you can send one request that can be picked up by X instances.
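
To illustrate the "one request, X instances" point, here is a toy model in plain Java. It is not Camunda code: each subscriber stands for one multi-instance iteration whose event subprocess waits for the signal, and the class name is hypothetical. In Camunda 7 the actual send would presumably be a single call such as `runtimeService.createSignalEvent("cancelNotifying").send()`, where the signal name is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of signal broadcast semantics (hypothetical, for illustration only).
final class SignalBroadcast {
    private final List<String> subscribers = new ArrayList<>();

    // Register one waiting iteration, identified by an execution id.
    void subscribe(String executionId) {
        subscribers.add(executionId);
    }

    // One send reaches every current subscriber and returns who was notified.
    List<String> send() {
        List<String> notified = new ArrayList<>(subscribers);
        // An interrupting event subprocess reacts once, so the subscription is consumed.
        subscribers.clear();
        return notified;
    }
}
```

A single `send()` notifies all registered iterations, whereas a plain message correlation in Camunda targets one matching execution.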

andrius · August 23, 2023, 1:58pm

Thank you so much!

it’ll actually cancel all running instances even ones that aren’t in error…

It’s completely fine with me. Actually, I have two independent feature requests:

cancel if it’s stuck with incidents
– for that I’d like to know if it’s really stuck
cancel if it suddenly became clear we do not need to send remaining messages.

It seems that for both use cases it’s desirable to cancel everything.

Niall · August 23, 2023, 2:46pm

What you could also do is, instead of letting the tasks throw an incident, have them throw a BPMN error and then change the signal event subprocess to an error event subprocess. That would clear up all those errors.

andrius · August 23, 2023, 3:00pm

Probably I did not understand your last suggestion well.
If you meant implementing an automatic error handler using an error event subprocess, that’s probably not what we need. In our case, incidents are technical errors whose causes and fixes are not clear, and they must not be ignored.

Maybe they were caused by a known outage of some subsystem. Now it’s alive again and it’s safe to retry all the multi-instance executions using “Increase number of retries”.
Maybe it took longer to discover or fix the problem and it no longer makes sense to send outdated messages. In that case the Camunda UI does not provide a button like “Cancel this activity but continue with the next one”. This is seemingly what we need.

andrius · August 23, 2023, 3:12pm

I’m a bit afraid of abusing your generosity, but two more questions:

If I add a boundary event to the parent of the multi-instance subprocess, will it still cancel all the small multi-instance executions?
Will an execution end listener on that parent still run on (before? after?) correlation of that interrupting message event?

Niall · August 24, 2023, 6:56am

In this case my suggestion would not be a good fit.

Don’t worry at all, I’m always happy to help when I can.

Yes. The boundary event has no concept that the task has multiple instances. As far as an interrupting boundary event is concerned, once it’s triggered, that task (and all related instances or subprocesses linked to that task) will be ended.

Good question… I don’t remember off the top of my head, but I’m pretty sure it won’t. The execution listener would only be activated if the task has been successfully completed, and an interrupting boundary event would put it in a state of being cancelled, so probably not.

andrius · August 28, 2023, 11:18am

I implemented “canceling” as an interrupting boundary event on “NOTIFYING”, but I can’t get it working.
When I try to correlate the message with an empty set of variables, it fails with different exceptions like these:

org.camunda.bpm.engine.OptimisticLockingException: ENGINE-03005 Execution of 'DELETE VariableInstanceEntity[0408b596-456f-11ee-be54-72438b8b7a7d]' failed. Entity was updated by another transaction concurrently.
	at org.camunda.bpm.engine.impl.db.EnginePersistenceLogger.concurrentUpdateDbEntityException(EnginePersistenceLogger.java:141)
...
org.camunda.bpm.engine.OptimisticLockingException: ENGINE-03005 Execution of 'DELETE MessageEntity[08650b1c-456f-11ee-be54-72438b8b7a7d]' failed. Entity was updated by another transaction concurrently.
	at org.camunda.bpm.engine.impl.db.EnginePersistenceLogger.concurrentUpdateDbEntityException(EnginePersistenceLogger.java:141)

MessageCorrelationBuilder correlationBuilder = runtimeService.createMessageCorrelation(message)
    .processInstanceBusinessKey(businessKey)
    .setVariables(variables);
return correlationBuilder.correlateWithResult().getExecution();

Why does Camunda try to DELETE variable instance and message entities at the same time?

I admit that this process instance was not stuck, but it was busy doing its multi-instance batches (which include variable updates).

But anyway, is it possible to stop it by correlating a message to the interrupting boundary event?
Should I suspend the process instance?
But then it will not correlate, I suspect…
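
One generic way to cope with a concurrent-update failure like the one above is to retry the call a limited number of times. The sketch below is plain Java with hypothetical names; it assumes that repeating the correlation is safe for this process, which the thread does not confirm, and it does not explain why the conflict occurs.

```java
import java.util.function.Supplier;

// Hypothetical retry helper, not part of the Camunda API.
final class RetryOnConflict {
    private RetryOnConflict() {}

    /**
     * Runs the action and repeats it when it fails with the given exception type.
     * Any other exception is rethrown immediately; after maxAttempts failures
     * the last matching exception is rethrown.
     */
    static <T> T call(Supplier<T> action,
                      Class<? extends RuntimeException> retryOn,
                      int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (!retryOn.isInstance(e)) {
                    throw e; // unrelated failure: do not retry
                }
                last = e; // conflict: remember it and try again
            }
        }
        throw last;
    }
}
```

With Camunda on the classpath, the action would wrap the `correlateWithResult()` call and `retryOn` would be `OptimisticLockingException.class`; a short pause between attempts may also help, but that is left out here.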