Correlating message to running process instances

Hello

We have a problem with the correlation of messages. Most of the time the correlation works as expected but randomly it fails. Using Camunda 7.15.0, Spring Boot 2.4.11.

We have a simple process called customer-fetcher: customer-fetcher.bpmn (3.3 KB)

The process is used in another process via a call activity.

It makes a REST call to an external service in getCustomerFromShopAdapter and then waits until we receive a message back. The external service usually sends the message after 2-3 seconds, and we then use the following code to correlate it to an existing process instance (messageType=ackGetCustomer):

camunda.getRuntimeService().createMessageCorrelation(message.getMessageType())
        .processInstanceBusinessKey(message.getTraceId())
        .setVariable(CustomerFetcherProcessConstants.VAR_CUSTOMER_SEARCH_RESPONSE, customerSearchResponse)
        .correlate();

Even though one process instance is waiting for ackGetCustomer, we get the exception org.camunda.bpm.engine.MismatchingMessageCorrelationException: Cannot correlate message 'ackGetCustomer': No process definition or execution matches the parameters. The process instance is still at the waiting point, as in the attached image.

If the external service resends the same message and we retry the correlation (going through exactly the same process), it works without problems.

Most of the time the correlation succeeds on the first try, and sometimes it fails. We are not able to reproduce the issue, as it seems to appear randomly.

What can be the issue?

Hello @alexh2084 ,

this sounds like a race condition to me: the returning message would then arrive before the engine has committed the wait state to the database.
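
To make the suspected race concrete, here is a minimal sketch in plain Java. It is a toy model, not Camunda code: the class WaitStateRaceDemo and its methods are hypothetical names, and the only assumption it encodes is that a message subscription becomes visible to the correlating thread once the engine transaction that created it has committed.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the suspected race. Hypothetical names, not the Camunda API. */
final class WaitStateRaceDemo {

    // Subscriptions that other transactions can already see (committed).
    private final Set<String> committedSubscriptions = new HashSet<>();
    // Subscriptions created inside the engine transaction that is still open.
    private final Set<String> pendingSubscriptions = new HashSet<>();

    /** The process instance reaches the message catch event; nothing is committed yet. */
    void reachWaitState(String businessKey) {
        pendingSubscriptions.add(businessKey);
    }

    /** The engine transaction commits, so the subscription becomes visible. */
    void commit() {
        committedSubscriptions.addAll(pendingSubscriptions);
        pendingSubscriptions.clear();
    }

    /** A correlation attempt from another thread only sees committed subscriptions. */
    boolean correlate(String businessKey) {
        return committedSubscriptions.remove(businessKey);
    }

    public static void main(String[] args) {
        WaitStateRaceDemo engine = new WaitStateRaceDemo();
        engine.reachWaitState("trace-1");

        // The reply overtakes the commit: no matching subscription is found.
        System.out.println("before commit: " + engine.correlate("trace-1")); // false

        engine.commit();

        // A resent message after the commit is correlated.
        System.out.println("after commit: " + engine.correlate("trace-1")); // true
    }
}
```

This would also be consistent with the observation that a resent message correlates without problems, because by then the wait state has been committed.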

Starting with Camunda Platform 7.16.0, there is an endpoint process-instance/message-async that creates a job for correlating a message instead of handling it in the same thread.

https://docs.camunda.org/manual/latest/reference/rest/process-instance/post-correlate-message-async/

The job executor will then take care of correlating your message for you.

I hope this helps

Jonathan

Hello @jonathan.lukas
Thanks for the reply. I just did the update to Camunda 7.16.0 and replaced the correlation logic with:

ProcessInstanceQuery processInstanceQuery = camunda.getRuntimeService()
        .createProcessInstanceQuery()
        .processInstanceBusinessKey(message.getTraceId());
camunda.getRuntimeService().createMessageCorrelationAsync(message.getMessageType())
        .processInstanceQuery(processInstanceQuery)
        .setVariable(CustomerFetcherProcessConstants.VAR_CUSTOMER_SEARCH_RESPONSE, customerSearchResponse)
        .correlateAllAsync();

Hope I got it correctly. First tests look good.

Thank you for the suggestion!

Hi @jonathan.lukas,

that sounds very nice, I was not aware of this new feature. Could you explain how long such a correlation job stays active, and what happens if it is executed while the respective process instance is still not ready for correlation?

BR rnschk

Hello @rnschk ,

this message correlation job will work like the correlateAll() command of the Java API, but executed by the Job Executor.

Jonathan

Hello @jonathan.lukas,
May I know why correlating the message asynchronously solves this problem?
As I understand it, after an asynchronous correlation the job executor picks up the job as soon as a thread is available.
If a thread is available but the running process instance has not yet reached its wait state, won't the message still arrive before the engine has committed the wait state to the database?

Thank you
Ken

I think the best way to solve this race condition is to always put the Send and Receive message events in a parallel gateway. Then you should mark the send event as async before, and the receive event must NOT be async before. This way you're telling Camunda to create the receiver in the database and only send the message in the next cycle, so you'll never send a message without being prepared for the response.
I also like to mark my Send Events as Async After, so that it commits to the database as soon as the message arrived, and then if any next steps fail, you won't need to receive the message again.
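
The ordering this pattern aims for can be sketched with a small toy model in plain Java. This is not Camunda code: AsyncBeforeOrderingDemo, forkAndCommit and runJobs are hypothetical names, and the model simply assumes that an async-before step is stored as a job and only executed after the transaction that created it has committed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

/** Toy model of "receive is committed first, send runs in the next cycle". Not the Camunda API. */
final class AsyncBeforeOrderingDemo {

    private final Set<String> committedSubscriptions = new HashSet<>();
    private final Queue<Runnable> committedJobs = new ArrayDeque<>();
    final List<String> log = new ArrayList<>();

    /**
     * First transaction: the token passes the parallel gateway.
     * The receive branch (not async before) creates its subscription right away.
     * The send branch (async before) only leaves a job behind.
     */
    void forkAndCommit(String businessKey) {
        committedSubscriptions.add(businessKey);
        committedJobs.add(() -> send(businessKey));
        log.add("committed subscription and send job");
    }

    /** A later cycle: the job executor runs the jobs that were committed earlier. */
    void runJobs() {
        Runnable job;
        while ((job = committedJobs.poll()) != null) {
            job.run();
        }
    }

    private void send(String businessKey) {
        log.add("sent request");
        // Worst case for the race: the reply comes back immediately.
        boolean correlated = committedSubscriptions.remove(businessKey);
        log.add("reply correlated: " + correlated);
    }

    public static void main(String[] args) {
        AsyncBeforeOrderingDemo engine = new AsyncBeforeOrderingDemo();
        engine.forkAndCommit("trace-1");
        engine.runJobs();
        engine.log.forEach(System.out::println);
    }
}
```

In this model even an immediate reply finds the subscription, because the request cannot leave before the commit that stores the subscription.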

Hello @huichunheo ,

in addition to the answer of @Jean_Robert_Alves, I would like to add that an asynchronous message correlation will be attempted up to 3 times until the message has been correlated. This retry mechanism should prevent race conditions. However, if you want to be 100% sure your message arrives, please use the proposal @Jean_Robert_Alves made.
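
As a rough illustration of why a bounded retry helps, here is a minimal sketch in plain Java. It is not the job executor's implementation: CorrelationRetryDemo and retry are hypothetical names, the simulated action stands in for a correlation attempt, and the sketch leaves out any waiting between attempts.

```java
import java.util.concurrent.Callable;

/** Toy model of a bounded retry. Hypothetical names, not the Camunda job executor. */
final class CorrelationRetryDemo {

    /** Runs the action until it succeeds or maxAttempts is reached, then rethrows the last failure. */
    static <T> T retry(int maxAttempts, Callable<T> action) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated correlation: the wait state is only "committed" by the third attempt.
        String result = retry(3, () -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("no matching execution yet");
            }
            return "correlated on attempt " + calls[0];
        });
        System.out.println(result); // correlated on attempt 3
    }
}
```

If the wait state is still not committed when the last attempt runs, the failure surfaces anyway, which is why the retry narrows the race window rather than closing it.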

Jonathan