Message Start Event broken after Process Instance Migration

  • Camunda 7.6 Community
  • Everything was deployed through the REST API.

Migration of the following BPMN:

MessageStart

  • Activity B had active incidents.
  • Activity B is a script that runs message correlation to another process definition using the Java API.

The process was deployed through the REST API.

eventCapture

The graph above shows message start events. Green is Version 1, and yellow is Version 2. You can see the migration occur just after 10:00. Before the migration, Version 2’s message start event was working. After the migration, the message start event stops working.

When you test the message start event using the REST API’s /message endpoint, an error such as the following is logged:

org.camunda.commons.logging.BaseLogger.logError 
ENGINE-16004 Exception while closing command context: 
ENGINE-13031 Cannot correlate a message with name 'my_event' to a single execution.
105 executions match the correlation keys:
    [ 
    CorrelationSet[businessKey=null, 
    processInstanceId=null, 
    processDefinitionId=null, 
    correlationKeys=null, 
    tenantId=null, 
    isTenantIdSet=false
    ]
  • Deploying again to bump the process version to 3 did not resolve the issue.
  • I had to deploy a Version 3 and suspend Version 1 and Version 2 for Version 3 to start working again. Re-enabling the suspended definitions causes the issue to start again.
  • It appears the problem specifically resolves when you suspend the migrated instances.
  • It appeared that Version 2 was ignoring the Async After property on the Message Start Event.
  • The migration between versions was just a generic default mapping. The REST API was used to generate, validate, and execute the migration.
  • There were no new activities and no changes to the existing activities. The migration’s purpose was just to move the active instances (some suspended instances waiting on a timer, and instances that had incidents on the “B” task).
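For anyone following along, the generate, validate, and execute sequence mentioned above can be sketched roughly as below. This is only a minimal sketch and not the exact calls we ran: the base URL is an assumed default for a local installation, the definition and instance ids are hypothetical placeholders, and the requests are only built, not sent.

```python
import json
from urllib import request

# Assumption: REST base URL of a local Camunda installation.
BASE_URL = "http://localhost:8080/engine-rest"


def build_post(path, payload):
    """Build (but do not send) a JSON POST request for the REST API."""
    return request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_request(source_definition_id, target_definition_id):
    # POST /migration/generate: asks the engine for a default plan
    # that maps equal activities between the two definitions.
    return build_post("/migration/generate", {
        "sourceProcessDefinitionId": source_definition_id,
        "targetProcessDefinitionId": target_definition_id,
    })


def validate_request(plan):
    # POST /migration/validate: the body is the migration plan itself.
    return build_post("/migration/validate", plan)


def execute_request(plan, process_instance_ids):
    # POST /migration/execute: the plan plus the instances to migrate.
    return build_post("/migration/execute", {
        "migrationPlan": plan,
        "processInstanceIds": list(process_instance_ids),
    })
```

Each request can be sent with `request.urlopen(req)`. The JSON returned by the generate call is the plan that is then passed to `validate_request` and `execute_request`.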

@camunda @thorben Any ideas on what would be causing this?

Recreation steps:

  1. Create a process definition with Message Start Event:

  2. Upload new version:

  3. Create Migration

  4. Run Migration

  5. Create Message:

If you suspend the migrated instances, the message start event starts to work again.
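The last step and the suspension observation above could be reproduced with requests along these lines. Again, this is a hedged sketch: the base URL is an assumed local default, the instance id is a hypothetical placeholder, and only the message name 'my_event' comes from the error log earlier in this post. The requests are built but not sent.

```python
import json
from urllib import parse, request

# Assumption: REST base URL of a local Camunda installation.
BASE_URL = "http://localhost:8080/engine-rest"


def build_request(method, path, payload):
    """Build (but do not send) a JSON request for the REST API."""
    return request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


def correlate_message_request(message_name):
    # POST /message with only a message name, i.e. no business key
    # and no correlation keys, as in the failing call.
    return build_request("POST", "/message", {"messageName": message_name})


def suspend_instance_request(process_instance_id, suspended=True):
    # PUT /process-instance/{id}/suspended toggles the suspension state
    # of a single process instance.
    instance = parse.quote(process_instance_id, safe="")
    path = "/process-instance/" + instance + "/suspended"
    return build_request("PUT", path, {"suspended": suspended})
```

Sending `suspend_instance_request(...)` for each migrated instance and then retrying `correlate_message_request("my_event")` matches the behaviour described above, where the start event works again once the migrated instances are suspended.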

Hi Stephen,

Maybe this is related to this Jira issue.

Rob

That's pretty close!

In our use case you don't need a user task to recreate the issue, though. Apart from that, it seems to be the same issue.

Hi guys,

I agree, this sounds like the issue Rob linked. This is a nasty bug because it makes the process instance state inconsistent (it creates subscriptions for process instances that shouldn’t have subscriptions), and there is no way to get rid of the subscriptions through the API. Via the Java API, you can define a message correlation that correlates to start events only, which should be a workaround. However, this is not exposed via the REST API. Another idea would be to migrate those instances to the same process definition but without the message start event.

Cheers,
Thorben

Are you saying that in the Migration /execute we do not have a Source/Target mapping for the Start event?

In the recreation example/steps above, we did not have source/target mappings for the start event.

I mean the following: Right now you have version 1 and 2 of Start Message Process. As a workaround, you could deploy version 3. Version 3 is the same as version 2, but instead of having a message start event, it has a none start event. You then migrate all instances from version 1 to version 3, but you start new instances by message with version 2 (or swap versions 2 and 3 so that the version you start new instances with is the latest version).

Not a great workaround for various reasons, but it avoids migrating to a process with a message start event, which is the trigger for the bug we are discussing.
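As a rough illustration of this workaround, the execute payload for moving every Version 1 instance to the hypothetical Version 3 (none start event) could be assembled as below. This is a sketch under assumptions: the definition ids are placeholders, the plan is expected to come from a prior generate call against Version 1 and Version 3, and the instances are selected with a process instance query instead of an explicit id list.

```python
import json


def workaround_execute_payload(plan, source_definition_id):
    """Build the body for POST /migration/execute that migrates all
    instances of the source definition according to the given plan."""
    if plan.get("sourceProcessDefinitionId") != source_definition_id:
        raise ValueError("plan does not start from the given source definition")
    return {
        "migrationPlan": plan,
        # Select every instance of the source definition via a query.
        "processInstanceQuery": {"processDefinitionId": source_definition_id},
    }


# Hypothetical ids: Version 1 (message start) and Version 3 (none start).
example_plan = {
    "sourceProcessDefinitionId": "StartMessageProcess:1:aaa",
    "targetProcessDefinitionId": "StartMessageProcess:3:ccc",
    "instructions": [],  # filled in by the generate call in practice
}
example_body = json.dumps(
    workaround_execute_payload(example_plan, "StartMessageProcess:1:aaa")
)
```

The guard on the source definition id is only there to catch a plan generated for the wrong version pair before anything is executed.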