ENGINE-16004 - NPE

Hello Guys,

lately I’ve been running into some strange, semi-reproducible exceptions.
Basic information on my setup:

  • Spring-Boot (multiple micro-services)
  • Shared process engine (each service operates on “default”-engine and same database)
  • each service is set to “deployment-aware”
  • each Service-Task (in each process) is explicitly marked as “Async Before”
  • multiple datasources (in each microservice the “@Primary” datasource is linked to the service’s own database, and the Camunda datasource is configured using “camundaBpmDataSource”)
  • inter-service communication via ActiveMQ
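
For readers who want to picture the datasource part of this setup, here is a minimal sketch of how it might be declared. The class name and both property prefixes are hypothetical and not taken from the original project; only the bean name “camundaBpmDataSource” comes from the post.

```java
import javax.sql.DataSource;

import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.boot.jdbc.DataSourceBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;

@Configuration
public class DataSourceConfiguration {

    // The service's own database; @Primary makes it the default injection candidate.
    @Bean
    @Primary
    @ConfigurationProperties(prefix = "datasource.service") // hypothetical prefix
    public DataSource serviceDataSource() {
        return DataSourceBuilder.create().build();
    }

    // The shared engine database; the Camunda starter looks this bean up by name.
    @Bean(name = "camundaBpmDataSource")
    @ConfigurationProperties(prefix = "datasource.camunda") // hypothetical prefix
    public DataSource camundaBpmDataSource() {
        return DataSourceBuilder.create().build();
    }
}
```

Note that this sketch declares no transaction manager at all, which is the configuration described at this point in the thread.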

The exception looks like this:

[camundaTaskExecutor-3] ERROR [] org.camunda.bpm.engine.context.logError [160] - ENGINE-16004 Exception while closing command context: null
java.lang.NullPointerException: null
        at org.camunda.bpm.engine.impl.pvm.runtime.LegacyBehavior.isAsync(LegacyBehavior.java:541)
        at org.camunda.bpm.engine.impl.pvm.runtime.LegacyBehavior.repairMultiInstanceAsyncJob(LegacyBehavior.java:570)
        at org.camunda.bpm.engine.impl.jobexecutor.AsyncContinuationJobHandler.execute(AsyncContinuationJobHandler.java:67)
        at org.camunda.bpm.engine.impl.jobexecutor.AsyncContinuationJobHandler.execute(AsyncContinuationJobHandler.java:40)
        at org.camunda.bpm.engine.impl.persistence.entity.JobEntity.execute(JobEntity.java:133)
        at org.camunda.bpm.engine.impl.cmd.ExecuteJobsCmd.execute(ExecuteJobsCmd.java:110)
        at org.camunda.bpm.engine.impl.cmd.ExecuteJobsCmd.execute(ExecuteJobsCmd.java:43)
        at org.camunda.bpm.engine.impl.interceptor.CommandExecutorImpl.execute(CommandExecutorImpl.java:28)
        at org.camunda.bpm.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:110)
        at org.camunda.bpm.engine.spring.SpringTransactionInterceptor$1.doInTransaction(SpringTransactionInterceptor.java:46)
        at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
        at org.camunda.bpm.engine.spring.SpringTransactionInterceptor.execute(SpringTransactionInterceptor.java:44)
        at org.camunda.bpm.engine.impl.interceptor.ProcessApplicationContextInterceptor.execute(ProcessApplicationContextInterceptor.java:70)
        at org.camunda.bpm.engine.impl.interceptor.CommandCounterInterceptor.execute(CommandCounterInterceptor.java:35)
        at org.camunda.bpm.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:33)
        at org.camunda.bpm.engine.impl.jobexecutor.ExecuteJobHelper.executeJob(ExecuteJobHelper.java:57)
        at org.camunda.bpm.engine.impl.jobexecutor.ExecuteJobsRunnable.executeJob(ExecuteJobsRunnable.java:110)
        at org.camunda.bpm.engine.impl.jobexecutor.ExecuteJobsRunnable.run(ExecuteJobsRunnable.java:71)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)

Sometimes the exception is thrown just once (because of “Async Before”, an immediate retry is attempted) and the process continues. But sometimes I get 3 exceptions in a row and end up with an incident.

The exception is semi-reproducible: I have a process in which I can consistently reproduce it, but I’m unable to isolate the issue in a simpler process. It consistently appears when I send a JMS message from Service B, which is received by Service A and then turned into a message correlation on a process already running in the context of Service A. The process itself is waiting at an Event-Based Gateway at this point. Transferring this constellation (even with multiple services, a shared process engine, and ActiveMQ) into a smaller, isolated project fails to reproduce the exception. At this point I’m pretty much stuck and hope some of you guys are able to help me out.

Cheers,

Jan

Hi Jan,

Thank you for sharing the details of your setup. I have some further questions:

  1. What are the versions in your setup: Spring Boot, Spring Boot starter, Camunda engine? It would be great if you could share your pom.xml (or Gradle configuration).
  2. You mentioned that the issue has occurred lately. Was the process running without an issue before? If yes, what changed? An update of a version maybe (starter or engine upgrade), or the deployment of a new version of the process definition?
  3. Could you please upload an example of a process model so we can have a look at it?

Best regards,
Yana

Hi Yana,

thanks for the reply. I’d be glad to answer your questions:

  1. We’re running Camunda 7.14.0 and Spring Boot 2.4.2 (according to the documentation, 7.14.0 only supports Spring Boot up to version 2.3.9.RELEASE; I haven’t tested against that yet, since downgrading Spring Boot would cause different headaches).
  2. “Lately” is not precise enough. We were also running into this issue on Camunda 7.13.0 and Spring Boot 2.3.[can’t remember]. It’s only been “lately” that our process manages to run into this issue 3 times in a row, which then causes an incident and brings the process to an unexpected halt.
  3. Please find one of our processes attached (this is the one on which I can consistently reproduce the exception; I just don’t know how and why). The exception is thrown when the main process is at the first Event-Based Gateway. A different service from the one running the process sends a JMS message via ActiveMQ. The message is received and a message correlation (messageName = ‘ECO_MSG_FM_LINE_ORDER_DELIVERY_DATE_DETERMINED’) to the process is triggered:
workflowRuntime
                .createMessageCorrelation(messageKey)
                .processInstanceId(processInstanceId)
                .setVariables(variables)
                .correlateAllWithResult();
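
To make the flow around this snippet easier to follow, here is a rough sketch of what the receiving side in Service A could look like. The listener class, the queue name, and the shape of the payload (a map carrying a "processInstanceId" entry) are all assumptions for illustration and do not come from the original project; only the message name and the correlation calls are taken from the post, and `workflowRuntime` is assumed to be Camunda's `RuntimeService`.

```java
import java.util.List;
import java.util.Map;

import org.camunda.bpm.engine.RuntimeService;
import org.camunda.bpm.engine.runtime.MessageCorrelationResult;
import org.springframework.jms.annotation.JmsListener;
import org.springframework.stereotype.Component;

@Component
public class DeliveryDateDeterminedListener {

    private static final String MESSAGE_NAME =
            "ECO_MSG_FM_LINE_ORDER_DELIVERY_DATE_DETERMINED";

    private final RuntimeService runtimeService;

    public DeliveryDateDeterminedListener(RuntimeService runtimeService) {
        this.runtimeService = runtimeService;
    }

    // Hypothetical queue name; the real destination is not given in the thread.
    @JmsListener(destination = "line-order.delivery-date-determined")
    public void onMessage(Map<String, Object> payload) {
        // Hypothetical payload field identifying the waiting process instance.
        String processInstanceId = (String) payload.get("processInstanceId");

        // Same calls as in the snippet above: correlate the message to the
        // instance that is waiting at the Event-Based Gateway.
        List<MessageCorrelationResult> results = runtimeService
                .createMessageCorrelation(MESSAGE_NAME)
                .processInstanceId(processInstanceId)
                .setVariables(payload)
                .correlateAllWithResult();
    }
}
```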

eco-proc-fm-line-order.bpmn (89.8 KB)

All the best

Jan

Hi Yana,
Hi Community,

this seems to be a somewhat tricky, hard-to-answer question.
I found a solution that works at least for us: *drumroll*

tl;dr: Instead of only providing a dedicated datasource configuration, also provide a dedicated PlatformTransactionManager.
Since @Bean(name = "camundaBpmTransactionManager") will only be supported from the upcoming 7.15 release onwards, the only working solution I found was to create my own bean: public class MyCamundaDatasourceConfiguration extends AbstractCamundaConfiguration implements CamundaDatasourceConfiguration. Inside this class I copied the preInit() implementation from DefaultDatasourceConfiguration, replacing not only the datasource but the transaction manager as well. Also, you have to make sure that this configuration is present before CamundaBpmAutoConfiguration is instantiated; otherwise your datasource configuration might get overridden by the default configuration (I achieved this by providing my own Spring Boot auto-configuration for Camunda, annotated with @AutoConfigureBefore(CamundaBpmAutoConfiguration.class)).
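
A rough sketch of what this pre-7.15 workaround might look like follows. It is not the original code: the auto-configuration class name is hypothetical, the use of DataSourceTransactionManager is an assumption, and the package names are the ones I would expect in the 7.14 starter, so please verify them against your version. The remaining statements of preInit() are deliberately left as a comment rather than guessed.

```java
import javax.sql.DataSource;

import org.camunda.bpm.engine.spring.SpringProcessEngineConfiguration;
import org.camunda.bpm.spring.boot.starter.CamundaBpmAutoConfiguration;
import org.camunda.bpm.spring.boot.starter.configuration.CamundaDatasourceConfiguration;
import org.camunda.bpm.spring.boot.starter.configuration.impl.AbstractCamundaConfiguration;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.boot.autoconfigure.AutoConfigureBefore;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.datasource.DataSourceTransactionManager;
import org.springframework.transaction.PlatformTransactionManager;

// Must itself be registered as an auto-configuration (for example via
// META-INF/spring.factories); otherwise @AutoConfigureBefore has no effect.
@Configuration
@AutoConfigureBefore(CamundaBpmAutoConfiguration.class)
public class MyCamundaAutoConfiguration { // hypothetical name

    // Providing our own CamundaDatasourceConfiguration bean early is meant
    // to keep the starter from falling back to DefaultDatasourceConfiguration.
    @Bean
    public CamundaDatasourceConfiguration camundaDatasourceConfiguration(
            @Qualifier("camundaBpmDataSource") DataSource camundaDataSource) {
        // Assumption: a plain JDBC transaction manager bound to the engine datasource.
        PlatformTransactionManager camundaTransactionManager =
                new DataSourceTransactionManager(camundaDataSource);
        return new MyCamundaDatasourceConfiguration(camundaDataSource, camundaTransactionManager);
    }
}

class MyCamundaDatasourceConfiguration extends AbstractCamundaConfiguration
        implements CamundaDatasourceConfiguration {

    private final DataSource camundaDataSource;
    private final PlatformTransactionManager camundaTransactionManager;

    MyCamundaDatasourceConfiguration(DataSource camundaDataSource,
                                     PlatformTransactionManager camundaTransactionManager) {
        this.camundaDataSource = camundaDataSource;
        this.camundaTransactionManager = camundaTransactionManager;
    }

    @Override
    public void preInit(SpringProcessEngineConfiguration configuration) {
        // The two replaced settings: both now point at the engine database.
        configuration.setTransactionManager(camundaTransactionManager);
        configuration.setDataSource(camundaDataSource);

        // Copy the remaining statements (database type, schema update,
        // table prefix, ...) from DefaultDatasourceConfiguration.preInit()
        // of the starter version in use.
    }
}
```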
As mentioned, from Camunda version 7.15 onwards this can be achieved using the aforementioned @Bean(name = "camundaBpmTransactionManager").
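
For 7.15 and later, a minimal sketch of that bean-based variant could look like this. The class name, the method names, and the choice of DataSourceTransactionManager for both managers are assumptions; a service using JPA would more likely want a JpaTransactionManager as its primary one. It assumes the “@Primary” datasource and the “camundaBpmDataSource” bean from the setup described at the top already exist.

```java
import javax.sql.DataSource;

import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.jdbc.datasource.DataSourceTransactionManager;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class TransactionManagerConfiguration {

    // Transaction manager for the service's own database. The unqualified
    // DataSource parameter resolves to the @Primary datasource.
    @Bean
    @Primary
    public PlatformTransactionManager serviceTransactionManager(DataSource serviceDataSource) {
        return new DataSourceTransactionManager(serviceDataSource);
    }

    // Dedicated transaction manager for the engine database, picked up by
    // the Camunda starter through its bean name (7.15 onwards).
    @Bean(name = "camundaBpmTransactionManager")
    public PlatformTransactionManager camundaBpmTransactionManager(
            @Qualifier("camundaBpmDataSource") DataSource camundaDataSource) {
        return new DataSourceTransactionManager(camundaDataSource);
    }
}
```

Declaring the primary manager explicitly matters here because Spring Boot stops auto-configuring a transaction manager once the application defines one of its own.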

One final word not related to this topic (those only here for the solution can stop reading):
Since this problem took me some time to get a grip on (and even now I’m not really sure why providing a dedicated PlatformTransactionManager solves it), I requested (paid) support via the Camunda contact form (twice). I’m not sure whether the contact form is in service or not (since I did not even receive a confirmation that my request had been registered). But aside from that: is it company policy that community users do not receive support even when they are willing to pay for it?

Cheers Jan

Hi Jan,
I’m really happy that you found a solution and that you took the time to post the solution. So thanks for that.

Regarding the help requested from support, I’m sorry you never got any reply. I’m going to follow up with the folks who deal with those requests and find out why.
If they had replied they probably would have let you know that we don’t offer any kind of technical or consulting support to community users. Only people with an enterprise contract will be able to get that kind of help.
This is partly down to the fact that we don’t have the bandwidth to deal with both open source users and enterprise users. We also spend a lot of time helping the open source community to grow in order to become self-sustaining, so hopefully people will feel that they can use the community version and not require consulting or support. Hope that clarifies things :)