Process instance incidents and synchronous service tasks

Cristian · May 28, 2021, 11:33pm

Hi there,

After reading the Camunda documentation on incidents, I am still not fully clear on how incident creation and handling work. I would like some clarification about a less discussed aspect of incidents: when we talk about an incident, do we mean the wait state where the process is stuck because something went wrong (i.e., the effect of the incident on process execution), or the process element that caused it to get stuck (i.e., the cause of the incident)?

In concrete terms, imagine a process with one user task followed by a service task. When attempting to complete the user task, an exception could occur not only while executing the user task logic, but also while executing the service task, in which case the cause of the incident is the service task logic. Would we say there was an incident on the user task (as an effect) or on the service task (as a cause)?

I would hope the Camunda process engine creates an incident via some registered incident handler, but the only types it supports are failedJob and failedExternalTask. I tried to register a custom incident handler of another type, but it was not invoked at all when the service task threw an exception. I assume either I am not doing it right or there is a reason for this, so could anyone clarify, please? I know I could make the service task asynchronous and reduce it to a job, but I am not satisfied with this approach. To my mind, it should be possible to generate incidents for any kind of process instance failure (except for failing to start it, of course). After all, that is how incidents are defined in the documentation.

Thank you,

Cristian

Niall · May 31, 2021, 10:15am

Hi @Cristian
Welcome to the forum. Let me try to explain why this is happening.

Firstly, let me clarify a really important point about incidents that will make it clear why adding async is needed for incidents to be triggered.
Incidents are only triggered when an error happens on a thread that is owned by the engine.
When a user task is completed, the work is done on a thread that comes from the end user. When an error occurs, the transaction rolls back to where that thread started (the user's front end), so the engine doesn't get the opportunity to intercept the error and deal with it as an incident.

Among other things, adding asynchronous before/after ends the work of the current thread at that point, and the engine itself picks up where it left off and continues running the process. In that case, when an error occurs, it rolls back to the originator of the thread, which is the engine itself, so the engine then creates the incident.
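
To make the thread ownership point more concrete, here is a minimal toy sketch in plain Java. It is not Camunda code: `IncidentSketch`, `ToyEngine`, `Step`, `completeUserTask`, and `runJobExecutor` are hypothetical names invented for illustration, and the sketch skips job retries, which the real engine goes through before an incident is raised. It only models who owns the thread when the service task throws.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Toy model only: none of these types are Camunda classes.
public class IncidentSketch {

    // One step after the user task; asyncBefore marks a boundary before it.
    static final class Step {
        final String name;
        final boolean asyncBefore;
        final Runnable logic;

        Step(String name, boolean asyncBefore, Runnable logic) {
            this.name = name;
            this.asyncBefore = asyncBefore;
            this.logic = logic;
        }
    }

    static final class ToyEngine {
        final List<String> incidents = new ArrayList<>();
        private final Queue<Integer> jobs = new ArrayDeque<>();
        private final List<Step> steps;

        ToyEngine(Step... steps) {
            this.steps = Arrays.asList(steps);
        }

        // Runs on the caller's thread, like completing a user task through the API.
        void completeUserTask() {
            runFrom(0, false);
        }

        // Stands in for the engine-owned thread that executes queued jobs.
        void runJobExecutor() {
            Integer index;
            while ((index = jobs.poll()) != null) {
                try {
                    runFrom(index, true);
                } catch (RuntimeException e) {
                    // The engine owns this thread, so it can record an incident
                    // at the step where the job was waiting.
                    incidents.add("failedJob at " + steps.get(index).name);
                }
            }
        }

        private void runFrom(int start, boolean resumedByJob) {
            for (int i = start; i < steps.size(); i++) {
                Step step = steps.get(i);
                boolean boundary = step.asyncBefore && !(resumedByJob && i == start);
                if (boundary) {
                    jobs.add(i); // stop here; the engine continues later
                    return;
                }
                step.logic.run(); // an exception propagates to whoever owns this thread
            }
        }
    }

    // Completes the user task in front of a failing service task and reports what happened.
    public static String demo(boolean asyncBefore) {
        ToyEngine engine = new ToyEngine(new Step("serviceTask", asyncBefore, () -> {
            throw new IllegalStateException("service failed");
        }));
        try {
            engine.completeUserTask();
        } catch (IllegalStateException e) {
            return "caller got exception, incidents=" + engine.incidents;
        }
        engine.runJobExecutor();
        return "caller returned normally, incidents=" + engine.incidents;
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // caller got exception, incidents=[]
        System.out.println(demo(true));  // caller returned normally, incidents=[failedJob at serviceTask]
    }
}
```

In the synchronous case the exception lands back with the caller and nothing is recorded. With the async boundary in front of the service task, the caller returns normally and the failure surfaces later on the engine's side, which is where the incident gets created.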

I hope that gives you some additional clarity.