Process instances sometimes do not appear to flush to the Camunda Database

Hello. My team has been working with Camunda for a few years now, but recently we’ve run into an odd scenario where sometimes, Camunda process instances do not appear to flush their contents to the ACT_HI_PROCINST table. This is what we’re doing roughly:

-We first do some processing and set some results in delegateExecution variables. We then set the delegateExecution process business key to something that we can refer to later.

public void execute(final DelegateExecution delegateExecution) throws IOException {
  //do some logic and set Camunda vars

  delegateExecution.setProcessBusinessKey(businessKey);
}

-Then, later on, we resume this process when a user takes action, like so:

processEngine.getRuntimeService().createMessageCorrelation("resume_message").processInstanceBusinessKey(businessKey).setVariables(vars).correlateWithResult();

The problem is, sometimes (but not always) for unknown reasons, our users are getting the following error message, which prevents them from going forward with the transaction:
“Cannot correlate message ‘resume_message’: No process definition or execution matches the parameters.”

When we go into the ACT_HI_PROCINST table to check on this, we find that there are no rows that match the businessKey. In other words, it appears as if the business key, or maybe the process instance in general, was never written to the DB. Has anyone experienced something like this before? It has left our team somewhat bewildered as to what could be going on.

As a side note, I’ve read elsewhere that Camunda does not flush its contents to the DB until it reaches a wait state. After creation, our process goes straight to an event based gateway, which waits for the resume_message. To my understanding, this means that the data should always be flushed to the DB after our process starts. We are not seeing any errors before it reaches this wait state.

Hello my friend! Welcome again!

Did you set the message name to “resume_message” in your “message intermediate catch event”?

This error seems to be occurring because when the message correlation event is triggered, the instance may not be in its gateway at that time… and this may be causing the error.

My question is… were there any cases where the message was correctly correlated?
Or are there also other types of events in your gateway that could be triggering the mechanism to make the instance “move”?

Regarding the transaction points in the database: Camunda only saves the state when it reaches a wait state (a “stopping point”), that is, a place where the instance stops and waits for something… it could be an event-based gateway, a user task, or similar things.
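
To make the commit behaviour concrete, here is a toy model in plain Java. It is only a sketch and not how Camunda is implemented: `WaitStateDemo`, `runToWaitState`, and the two lists standing in for database tables are all made up for illustration. It assumes that everything between the start of the instance and the first wait state runs in one transaction, that a failure anywhere in that stretch discards the engine's pending rows, and that a write made outside that transaction is not undone.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: illustrates "commit at the wait state". Not Camunda code. */
final class WaitStateDemo {

    /**
     * Runs all steps in one simulated transaction. The instance row is buffered
     * and only copied into engineTable if every step before the wait state succeeds.
     */
    static boolean runToWaitState(List<String> engineTable, String instanceRow, Runnable... steps) {
        List<String> pending = new ArrayList<>();
        pending.add(instanceRow);           // buffered, not yet visible in engineTable
        try {
            for (Runnable step : steps) {
                step.run();
            }
        } catch (RuntimeException e) {
            return false;                   // simulated rollback: pending rows are dropped
        }
        engineTable.addAll(pending);        // simulated commit at the wait state
        return true;
    }

    public static void main(String[] args) {
        List<String> engineTable = new ArrayList<>(); // stands in for the engine's tables
        List<String> appTable = new ArrayList<>();    // stands in for an application-owned table

        // Instance 1: the only step succeeds, so the instance row is committed.
        runToWaitState(engineTable, "instance-1", () -> appTable.add("tx-1"));

        // Instance 2: a later step fails before the wait state is reached.
        runToWaitState(engineTable, "instance-2",
                () -> appTable.add("tx-2"),
                () -> { throw new IllegalStateException("failure before the wait state"); });

        System.out.println(engineTable); // prints [instance-1]
        System.out.println(appTable);    // prints [tx-1, tx-2]
    }
}
```

In the real engine the exception would normally propagate to the caller that started the instance instead of being turned into a `false` return value, so this only shows the visible effect on the tables.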

But you can use the Asynchronous continuation mechanism to create a new transaction point in your process… which can be Async before or after.

Below I leave the link to the official documentation on this subject, where you can see and better understand the details.

William Robert Alves

Yes, the message intermediate catch event is named resume_message. And yes, most of the time, users are able to resume the process, and do not receive the error. We do have another message intermediate catch event called “cancel_message”, which is attached to the same process. The idea is that a user can either resume a transaction, or cancel it. It would make sense to see this error if a user previously cancelled a transaction and then tried to resume it, but this is not what we are seeing in the Camunda DB tables. Instead, when a transaction is pending on a user (before any action has been taken on the transaction), the users get the aforementioned error. When we look in the Camunda ACT_HI_PROCINST table, it appears as if Camunda never wrote the process instance to the DB. This suggests that the process may never be getting to a “wait state”.

Nothing is marked as async in our flow. Before the wait state, there is only one service task, and after the wait state there is also one service task, so I don’t believe we need async processing. I’ll attach a picture of what our flow looks like for reference.

At the end of the create transactions service task, in Java code, we set the process business key to a unique number, like so:

delegateExecution.setProcessBusinessKey(businessKey);

Later on, the user provides the business key of the transaction they want to process (and whether they want to resume or cancel the transaction). In the code, this is accomplished by doing this:

processEngine.getRuntimeService().createMessageCorrelation("resume_message").processInstanceBusinessKey(businessKey).setVariables(vars).correlateWithResult();

or

processEngine.getRuntimeService().createMessageCorrelation("cancel_message").processInstanceBusinessKey(businessKey).setVariables(vars).correlateWithResult();

By that point, the users sometimes get the error, and they are unable to take action on the transaction, because the process instance does not appear to exist in Camunda.

hello!

I’m not sure, but in your case, I believe that marking the “Create Transaction” task as an Async After / Exclusive will solve your problem.

I would need to understand your code better, and when the message correlation is being called…

But try setting the task to async after and test to see if this problematic behavior continues…
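
For reference, a minimal sketch of what that setting looks like in the BPMN XML. The element `id` and the delegate class name are placeholders I made up, and the `bpmn:` prefix depends on how your modeler writes the file; only the two `camunda:` async attributes are the point here.

```xml
<!-- Hypothetical task definition: id and class name are placeholders -->
<bpmn:serviceTask id="createTransaction" name="Create Transaction"
                  camunda:class="com.example.CreateTransactionDelegate"
                  camunda:asyncAfter="true"
                  camunda:exclusive="true" />
```

`exclusive` already defaults to true, so setting it explicitly is optional. Keep in mind that with async after, the engine commits once the task finishes and the rest of the path (including arrival at the event-based gateway) is continued by a job, so the job executor has to be running for the instance to actually reach the wait state.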

William Robert Alves

Hello again,

Despite making “Create Transaction” async after exclusive, our end users are still running into this problem sometimes (even with brand new transactions). We set the business key at the end of “Create Transaction” like so:

delegateExecution.setProcessBusinessKey(businessKey);

But yet we see no Camunda key in the ACT_HI_PROCINST table. It may be worth noting that we keep track of these transactions in our own DB as well. We also have created a Camunda listener in Java that triggers when it reaches the “wait for user action” block shown in the picture above, and it will log something like “received transaction 123”. Interestingly, when we see this problem occur, we DO see record of the affected transaction in our own DB, but we DON’T see any Camunda log from the Camunda listener, and we DON’T see the linked business key in that Camunda DB table. In these cases, the “Create Transaction” block does not throw any errors either, so we are fairly certain that this Camunda block is “exited from”, but for whatever reason, Camunda refuses to flush its data to the internal Camunda DB, and thus our users get these errors.

Is there anything else about this scenario that jumps out to anyone? Our team is still not sure why Camunda occasionally refuses to flush its data. Thanks.

Sorry if my question seems silly, but is your Camunda saving history in general?

Does your Camunda have any history settings that could be causing this behavior?
Something like this in your application properties or yml:

camunda.bpm.history-level=none

William Robert Alves

Remember that tables that contain HI in their name are related to history, and history needs to be enabled in Camunda for data to be saved in them.

William,

Good thinking, but that unfortunately doesn’t seem to be our issue in this case. Our Java code does not explicitly define the history-level, so Camunda is defaulting to FULL. Researching elsewhere, I can confirm this, because the ACT_GE_PROPERTY table reports historyLevel as 3.

We do see history for these transactions most (seemingly about 80%) of the time. It’s only in that remaining 20% that Camunda does not flush the data to the DB, and there is no associated listener log event. If it’s any help, here are our existing Camunda settings:

camunda:
  bpm:
    database:
      type: (our DB type is here)
      schema-name: TRANSACTIONS_PROD
      jdbc-batch-processing: false
      schema-update: false
      table-prefix: TRANSACTIONS_PROD.

Thanks for your continued assistance on this.

Hello my dear!

This could perhaps be some kind of concurrency, but it still seems strange to me…

What version of Camunda Platform do you use? Is the version in which you saved your Camunda flow the same version as your deployed Camunda Platform?

Have you made any recent changes to the version? If so, did you run the scripts to update the database?

Do all history tables update normally and is it just the ACT_HI_PROCINST table that is not updating 100% of the time?

Sorry for so many questions… but maybe they will help us find the possible problem!

William Robert Alves

We are currently using Camunda version 7.15.0, which we “installed” by running a series of SQL scripts which created all of the needed tables and data. We have not upgraded the version since inception. It appears that other history tables relevant to what features of Camunda we use are getting populated as expected. In our case, the ACT_HI_PROCINST table is the only table we care about when trying to resume a workflow.

However, now that you mentioned the version differences, I see that we have the following dependency in our Java code:

<dependency>
        <groupId>org.camunda.bpm</groupId>
        <artifactId>camunda-bom</artifactId>
        <version>7.18.0</version>
        <scope>import</scope>
        <type>pom</type>
      </dependency>

So the DB is 7.15.0, and the platform is 7.18.0. I’m sure this could potentially be causing issues. Our team is currently unable to release changes to our application, but we will have the opportunity soon. I will give this version change a shot, and will let you know if that’s what’s been causing all these issues. Thanks.


Perfect!

This could be the case, as from 7.15 to 7.18 there were considerable changes, and this could be impacting or conflicting in your project!

I hope it helps to solve your problem.


William Robert Alves

William,

It looked like this upgrade had fixed the problem for some time, but unfortunately, we saw the same error today with one of our users. The request was a new one as well, nothing old. The situation is still the same: the Camunda entry is simply not there at the point of user action. We use a number for each transaction. Interestingly, it is always only the specific transaction’s entry in ACT_HI_PROCINST that is missing. In our example, request number 2240613 was missing, but those around it, 2240612 and 2240614, were present. We suspect this might be some sort of concurrency problem, a sequence problem, or maybe some kind of Camunda configuration problem.

Is it possible that this is due to some kind of Camunda bug found in version 7.18? We are still able to upgrade to a newer version (like 7.20), if you think that may remedy this strange issue. Otherwise, is there anything that stands out to you at this point? We’re almost out of ideas as to what this could be. Thanks.
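
One way to probe the concurrency theory would be to retry the correlation a few times before giving up. If a retry succeeds, the instance was simply not committed yet when the user acted; if every attempt fails, the instance was never written at all. Below is a minimal sketch in plain Java. `CorrelationRetry` and `retry` are hypothetical names, not part of Camunda, and the sketch retries on any `RuntimeException` purely to stay self-contained.

```java
import java.util.function.Supplier;

/** Hypothetical helper, not part of Camunda: retries an action that may fail transiently. */
final class CorrelationRetry {

    private CorrelationRetry() {}

    /**
     * Calls the action up to maxAttempts times, sleeping delayMillis between attempts.
     * Returns the first successful result, or rethrows the last failure.
     */
    static <T> T retry(Supplier<T> action, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;                                // remember the failure
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(delayMillis);       // wait before the next attempt
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw e;                         // stop retrying when interrupted
                    }
                }
            }
        }
        throw last;                                      // all attempts failed
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        // Simulated correlation that fails twice before succeeding.
        String result = retry(() -> {
            attempts[0]++;
            if (attempts[0] < 3) {
                throw new IllegalStateException("no matching execution yet");
            }
            return "correlated";
        }, 5, 10L);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

In our application the supplier would wrap the existing `createMessageCorrelation("resume_message")…correlateWithResult()` call, and it would be better to retry only on the mismatching-correlation exception rather than on every runtime exception. Logging the attempt count would tell us which of the two cases we are in.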

Hello my friend!

I was away for a while, but now I’m back.

I was analyzing your answers again, and I noticed that you are setting a businessKey within the “Create transaction” service task… is that right?

After an instance is started, we must not change the business key… the business key must be a unique key for the instance and easy to identify for users.

What you could do is create a variable in the “Create transaction” task and set the value of this variable as you wish, and use the value of this variable to correlate the message.

Please let me know if I understand the context correctly.

William Robert Alves

Yes, we are setting the business key during the create transaction phase. In this case, the business key is tied to a SQL numerical index. A separate SQL instance used by our app links a user’s session ID to a SQL index. Basically, that means that when a user starts a transaction, that same user should be able to later resume that transaction based on the unique SQL index tied to their ID. That same unique index is also what we use as the Business Key in Camunda. Nothing in our code should be changing this key- we see that the key itself simply is never created in Camunda, despite no errors at the time of creation (during the create transaction phase). Since there is specific data tied to each transaction, we also searched by the data to see if Camunda perhaps had the data, but under a different/wrong key. We didn’t find the expected data under any key, leading me to believe that, for whatever reason, Camunda is not saving the key at the time of create transaction.

We could set a separate Camunda variable apart from the business key and use that to resume the process I suppose. We’ve not tried that, although I’m not sure it will resolve our “vanishing Camunda data” problem. I guess it’s worth a shot.

Currently we use the business key like so to resume transactions:
processEngine.getRuntimeService().createMessageCorrelation("resume_message").processInstanceBusinessKey(businessKey).setVariables(vars).correlateWithResult();

Are you saying we can resume a process using one or more variables, like this, instead?
processEngine.getRuntimeService().createMessageCorrelation("resume_message").processInstanceVariableEquals("anotherVar", 123).setVariables(vars).correlateWithResult();