We have a problem with the correlation of messages. Most of the time the correlation works as expected, but occasionally it fails at random. We are using Camunda 7.15.0 with Spring Boot 2.4.11.
The process is used in another process via a call activity.
It makes a REST call to an external service in getCustomerFromShopAdapter and should then wait until we receive a message back. The external service usually sends the message after 2-3 seconds, and we then use the following code to correlate the message to an existing process instance (messageType=ackGetCustomer).
Even though one process instance is waiting for ackGetCustomer, we get an org.camunda.bpm.engine.MismatchingMessageCorrelationException: Cannot correlate message 'ackGetCustomer': No process definition or execution matches the parameters exception, and the process instance stays at the waiting point, as in the attached image.
If the external service resends the same message and we retry the correlation (going through exactly the same process), it works without problems.
Most of the time the correlation succeeds on the first try, and sometimes it fails. We are not able to reproduce the issue, as it seems to appear randomly.
This sounds like a race condition to me: the returning message would then arrive before the engine has committed the wait state to the database.
Starting with Camunda Platform 7.16.0, there is an endpoint, process-instance/message-async, that creates a job for correlating a message instead of handling it in the same thread.
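To make the suspected race easier to picture, here is a minimal toy model in plain Java. It is not Camunda code: the class CorrelationRaceDemo and its methods are hypothetical stand-ins, and the in-memory set only imitates the idea that a message can be matched once the wait state has been committed. In a real application the correlate step would be a call such as runtimeService.createMessageCorrelation("ackGetCustomer").correlate().

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of the race: a message can only be correlated once the wait state is committed. */
public class CorrelationRaceDemo {

    // Hypothetical stand-in for the engine's committed message subscriptions.
    private final Set<String> committedSubscriptions = ConcurrentHashMap.newKeySet();

    /** Imitates the commit of the transaction that reaches the receive step. */
    public void commitWaitState(String messageName) {
        committedSubscriptions.add(messageName);
    }

    /** Returns true if a committed subscription was found and consumed, false otherwise. */
    public boolean correlate(String messageName) {
        return committedSubscriptions.remove(messageName);
    }

    public static void main(String[] args) {
        CorrelationRaceDemo engine = new CorrelationRaceDemo();

        // The reply arrives before the wait state is committed: nothing matches.
        boolean early = engine.correlate("ackGetCustomer");

        // The engine commits the wait state, then the external service resends.
        engine.commitWaitState("ackGetCustomer");
        boolean late = engine.correlate("ackGetCustomer");

        System.out.println("early=" + early + ", late=" + late); // prints "early=false, late=true"
    }
}
```

This mirrors what was reported in the question: the first attempt fails, and the resent message correlates without problems.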
That sounds very nice; I was not aware of this new feature. Could you explain how long such a correlation job stays active, and what happens if it is executed while the respective process instance is still not ready for correlation?
Hello @jonathan.lukas,
May I ask why correlating the message asynchronously can solve this problem?
As I understand it, after an async message correlation the job executor picks up the job as soon as a thread is available.
If a thread is available but the running process instance has still not finished its transaction, won't the message still arrive before the engine has committed the wait state to the database?
I think the best way to solve this race condition is to always put the Send and Receive message events on the branches of a parallel gateway. Then you should mark the send event as async before, while the receive event must NOT be async before. This way you are telling Camunda to create the receiver in the database first and to send the message only in the next cycle, so you will never send a message without being prepared for the response.
I also like to mark my Send Events as Async After, so that the engine commits to the database as soon as the message has arrived; then, if any of the next steps fail, you won't need to receive the message again.
In addition to the answer of @Jean_Robert_Alves, I would like to add that an async message correlation is executed at most 3 times until the message has been correlated. This retry mechanism should prevent race conditions; however, if you want to be 100% sure your message arrives, please use the approach @Jean_Robert_Alves proposed.
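The bounded retry described here can be sketched in plain Java as follows. This is only an illustration of the idea, not the engine's job executor: BoundedRetry, retry, and the delay value are hypothetical, and the limit of 3 attempts is simply taken from the post above. On a pre-7.16 setup, the attempt passed in could wrap a synchronous correlation call and return false when a MismatchingMessageCorrelationException is caught.

```java
import java.util.function.BooleanSupplier;

/** Sketch of a bounded retry around an operation that may fail at first. */
public class BoundedRetry {

    /**
     * Runs the attempt up to maxAttempts times, pausing between failed tries.
     * Returns the 1-based number of the attempt that succeeded, or -1 if none did.
     */
    public static int retry(BooleanSupplier attempt, int maxAttempts, long delayMillis) {
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return i;
            }
            if (i < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical attempt: fails on the first call, succeeds on the second,
        // like a correlation that arrives just before the wait state is committed.
        int succeededOn = retry(() -> ++calls[0] >= 2, 3, 10);
        System.out.println(succeededOn); // prints 2
    }
}
```

A retry like this only narrows the window. If all attempts run before the wait state is committed, the message is still lost, which is why the parallel gateway proposal is the safer option.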