How are failures in Asynchronous Continuations for Service Tasks handled?

Yusuf · October 21, 2024, 10:48am

In our BPMN, we have a Service Task that internally invokes a downstream application.
After we receive a successful response from the downstream application, there seems to be an OLE while updating the process variables, and the Service Task is invoked again.

This issue occurs only under stress testing. Could you explain how Camunda handles failures encountered while saving the DB state when asyncBefore is set for the Service Task?

fml2 · October 21, 2024, 4:58pm

What is OLE?

Why is this a problem for you? Is the service not an idempotent one? Then set the retry count to zero.
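To illustrate what "idempotent" means here, a minimal sketch, assuming the downstream service accepts a caller-supplied key. `ShippingClient`, `create_order` and the key format are hypothetical names for illustration, not part of Camunda or of the poster's system:

```python
class ShippingClient:
    """Hypothetical downstream client that remembers requests it already handled."""

    def __init__(self):
        self._results = {}
        self.calls = 0  # counts how often the real work was done

    def create_order(self, idempotency_key, payload):
        # A repeated key returns the stored result instead of doing the work again,
        # so a retried Service Task does not create a second order.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.calls += 1
        result = {"orderId": f"order-{self.calls}", "payload": payload}
        self._results[idempotency_key] = result
        return result


client = ShippingClient()
first = client.create_order("instance-42:shipOrder", {"item": "book"})
retry = client.create_order("instance-42:shipOrder", {"item": "book"})
```

With a service like this, a second invocation after a rollback is harmless because `retry` is the same result as `first` and the work ran only once.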

Yusuf · October 21, 2024, 6:30pm

Hi @fml2, thanks for your interest. I was referring to optimistic lock exceptions.

In our Camunda setup, the downstream application is invoked asynchronously. As part of the business validations, it updates the process variables for the instance via a REST API.

Under load, this REST API call seems to be made before the async continuation job, which is managed by the thread pool, is executed. This job is scheduled for the Receive Task to handle the async response from the downstream application.

As a result, we are attempting to update a process variable that has not yet been committed, leading to an optimistic lock exception. Consequently, the process retries from the last save point.
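For readers unfamiliar with the term, here is a toy model of revision-based optimistic locking. It is only a sketch of the general technique, not Camunda's actual implementation; `Row`, `update` and the status values are made up for illustration:

```python
class OptimisticLockError(Exception):
    """Raised when a row was changed by someone else since it was read."""


class Row:
    """A stored record guarded by a revision counter (hypothetical model)."""

    def __init__(self, value):
        self.value = value
        self.revision = 1


def update(row, expected_revision, new_value):
    # Write only if nobody bumped the revision since we read the row.
    if row.revision != expected_revision:
        raise OptimisticLockError(
            f"expected revision {expected_revision}, found {row.revision}"
        )
    row.value = new_value
    row.revision += 1


def demo():
    row = Row("CREATED")
    seen_by_first = row.revision    # first writer reads the row
    seen_by_second = row.revision   # second writer reads the same revision
    update(row, seen_by_second, "SHIPPED")      # second writer commits first
    try:
        update(row, seen_by_first, "WAITING")   # first writer is now stale
    except OptimisticLockError:
        return "rolled back", row.value, row.revision
    return "committed", row.value, row.revision
```

In this model whichever writer commits second loses and has to roll back, which matches the retry from the last save point described above.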

fml2 · October 21, 2024, 6:53pm

Do you directly update a variable via the REST API (which one?)? Or do you send (correlate) a message (along with some variables) to the instance so that the variables get integrated into the process?

I’m asking because depending on the situation, the solution could be simple or … not that simple (in my view at least).

In general, I’d not assume a certain behaviour on an optimistic locking exception, and I’d assume that the data is lost (rolled back).

Yusuf · October 22, 2024, 6:10am

We are maintaining a FlowContext resource in the process variables for each instance.

This has all of the business data relevant to the current flow. From the downstream application, we invoke the REST API to update this resource using this endpoint:
https://docs.camunda.org/rest/camunda-bpm-platform/7.22/#tag/Process-Instance/operation/modifyProcessInstanceVariables

This is to update the variables that are changed as part of the business process.

The OLE seems to be encountered during this REST call itself.

After the downstream processing is done, we also send a message to complete the async step.

fml2 · October 22, 2024, 8:35am

Thank you for the clarification. I’ve never used that API and can’t tell how it interacts with the engine.

But why do you do it in two steps? Why don’t you send the new value(s) along with the event?
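A minimal sketch of what sending the values along with the event could look like, assuming the Camunda 7 REST message correlation endpoint (`POST /message`) is used. The message name, instance id and variable name are hypothetical, and the helper only handles string variables:

```python
import json


def build_correlation_body(message_name, process_instance_id, variables):
    """Build a JSON body for POST /message that carries variables with the message."""
    return {
        "messageName": message_name,
        "processInstanceId": process_instance_id,
        # Each variable is sent as a typed value object; only strings are handled here.
        "processVariables": {
            name: {"value": value, "type": "String"}
            for name, value in variables.items()
        },
    }


# Hypothetical names: adjust to the message and variables defined in your model.
body = build_correlation_body(
    "ShippingCompleted",
    "some-process-instance-id",
    {"shippingStatus": "DELIVERED"},
)
payload = json.dumps(body)
```

The point of the design is that the variable update and the message correlation arrive in one request, so there is no separate variable write to race against.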

Yusuf · October 22, 2024, 9:03am

Our use case is that the downstream application invokes shipping and sends intermediate responses about the status of our shippingOrder.

This status is relevant to our other applications, which perform business logic based on the change.

Once the order is delivered, we can send the AsyncResponse to complete the Receive Task.

I suspect the issue is that the variable update is processed before the async continuation job is picked up by the job executor to commit the DB changes.

GotnOGuts · October 22, 2024, 3:55pm

Could you post the BPMN diagram?
I think that, with how you have things set up, you are causing your own optimistic locking exception, given how Camunda 7 handles DB transactions.