Incomplete set of started processes

Hi,

I’m seeing something that I can’t quite explain, and maybe you can help me understand.
If I launch 10 processes via the Java API (using “runtimeService.startProcessInstanceByKey”), I see 10 started in Cockpit. So far so good. However, only 7 out of 10 actually move past the start event. The process takes several minutes to finish. Once the first 7 processes finish, the remaining 3 finally move past the start event.
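
For reference, the launch code is essentially this (simplified; in my app the runtimeService is a Spring bean, more on that below, and “myProcess” is a placeholder for the real process definition key):

import org.camunda.bpm.BpmPlatform;
import org.camunda.bpm.engine.ProcessEngine;
import org.camunda.bpm.engine.RuntimeService;

public class StartTenInstances {
  public static void main(String[] args) {
    // Look up the shared engine and start 10 instances;
    // "myProcess" is a placeholder for the actual process definition key.
    ProcessEngine engine = BpmPlatform.getDefaultProcessEngine();
    RuntimeService runtimeService = engine.getRuntimeService();
    for (int i = 0; i < 10; i++) {
      runtimeService.startProcessInstanceByKey("myProcess");
    }
  }
}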

The reason I know they are stalled on the start event is that I have a Java ExecutionListener in my code, and it logs when the process transitions past the start event. The process itself also logs things as it goes, and I only see logging for the first 7; logging for the remaining 3 only starts after that.
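
The listener is nothing fancy, roughly this (the class name is made up for this post):

import java.util.logging.Logger;
import org.camunda.bpm.engine.delegate.DelegateExecution;
import org.camunda.bpm.engine.delegate.ExecutionListener;

// Logs when an execution passes through the element the listener is attached to.
public class StartEventLoggingListener implements ExecutionListener {

  private static final Logger LOG =
      Logger.getLogger(StartEventLoggingListener.class.getName());

  @Override
  public void notify(DelegateExecution execution) throws Exception {
    LOG.info("Process instance " + execution.getProcessInstanceId()
        + " passed " + execution.getCurrentActivityId());
  }
}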

I have my max job executor threads set to 10, so that shouldn’t be the issue… It’s as if something in the engine is only letting 7 processes go through at a time.

Any ideas on what might be causing this?

Thanks,
Galen

Hi Galen,

You could see if maxJobsPerAcquisition makes a difference…

Some other obscure bottlenecks I have come across:

  • Insufficient connections in the JDBC pool (see the snippet below)…
  • Limited connections in the HTTP client used in service tasks…
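
For the JDBC point: on Tomcat the pool size is configured on the datasource Resource in server.xml, roughly like this (driver, url and values here are illustrative only; the size attribute is maxActive with DBCP 1.x, maxTotal with DBCP 2):

<Resource name="jdbc/ProcessEngine" auth="Container"
          type="javax.sql.DataSource"
          driverClassName="org.h2.Driver"
          url="jdbc:h2:./camunda-h2-dbs/process-engine"
          username="sa" password=""
          maxActive="20" minIdle="5" />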

regards
Rob

Thanks Rob,

Here are some recent updates:

I have tried bumping maxJobsPerAcquisition up to 15 (in bpm-platform.xml), and that doesn’t seem to help.

When I started the 10 processes, I also monitored the threads via JConsole, and I saw only 7 threads that look like “pool-2-thread-X”, i.e. “pool-2-thread-1” through “pool-2-thread-7”. Once the processes are done executing, the number of threads goes back down to 3.
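
In case anyone wants to double-check the JConsole numbers programmatically, something like this run inside the server JVM should do it (the “pool-2-thread-” prefix matches what I saw, but the pool number isn’t guaranteed to be stable):

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class PoolThreadCount {
  // Counts live threads whose names match the executor pool's naming pattern.
  public static int count(String prefix) {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    int count = 0;
    for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
      if (info != null && info.getThreadName().startsWith(prefix)) {
        count++;
      }
    }
    return count;
  }
}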

So the next thing I tried was setting maxPoolSize to 15. This didn’t work either: monitoring the threads, the count still only went up to 7. During this test I also had corePoolSize set to 5. My assumption was that the pool would grow from corePoolSize to maxPoolSize as necessary.

Through experimentation, it turns out that if corePoolSize is set to anything less than 7, only seven will be picked up by the job executor.

The documentation here:
https://docs.camunda.org/manual/7.4/reference/deployment-descriptors/tags/job-executor/#job-executor-configuration-properties

does not explain how or why the number of threads grows to maxPoolSize. When does this happen?

So in the end, it was not JDBC connections or client connections, but rather setting corePoolSize to a higher value. It seems as though that many threads need to be ready to go up front. I am going to experiment some more with these settings. This may also explain why I’ve been seeing some apparent slowness in the job executor (JE) grabbing jobs when I’m running a lot of processes.

Thanks,
Galen

Hi Galen,

In what setup do you use the engine? Shared or embedded engine? If shared, on which application server? Which Camunda version? Please also share the code/xml that configures the engine and job executor.

Cheers,
Thorben

I’m using a shared engine on Tomcat, Camunda version 7.4.

server.xml:

<Resource name="global/camunda-bpm-platform/process-engine/ProcessEngineService!org.camunda.bpm.ProcessEngine Service" auth="Container"
              type="org.camunda.bpm.ProcessEngineService"
              description="camunda BPM platform Process Engine Service"
              factory="org.camunda.bpm.container.impl.jndi.ProcessEngineServiceObjectFactory" />
              
    <Resource name="global/camunda-bpm-platform/process-engine/ProcessApplicationService!org.camunda.bpm.ProcessApplicationService" auth="Container"
              type="org.camunda.bpm.ProcessApplicationService"
              description="camunda BPM platform Process Application Service"
              factory="org.camunda.bpm.container.impl.jndi.ProcessApplicationServiceObjectFactory" />
              
  </GlobalNamingResources>
...

Here’s the relevant part of bpm-platform.xml. NOTE that I set corePoolSize and maxPoolSize not through bpm-platform.xml, but by setting them via JMX. The JMX set call is made when I bring up my application server.
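
The JMX call runs in the server JVM and looks essentially like this; note that the ObjectName and attribute names below are placeholders from memory, so verify the real ones in JConsole’s MBeans tab before relying on them:

import java.lang.management.ManagementFactory;
import javax.management.Attribute;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class TuneJobExecutorPool {
  public static void main(String[] args) throws Exception {
    // Runs inside the server JVM, so the platform MBean server is enough.
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();

    // Placeholder ObjectName -- look up the actual name in JConsole's MBeans tab.
    ObjectName pool = new ObjectName("org.camunda.bpm.platform:type=job-executor");

    // Attribute names assumed; verify against the MBean's attribute list.
    server.setAttribute(pool, new Attribute("CorePoolSize", 10));
    server.setAttribute(pool, new Attribute("MaximumPoolSize", 15));
  }
}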

		<job-executor>
		<job-acquisition name="default">
			<properties>
				<property name="lockTimeInMillis">300000</property>
				<property name="waitTimeInMillis">3000</property>
				<property name="maxJobsPerAcquisition">3</property>
			</properties>
		</job-acquisition>
	</job-executor>

	<process-engine name="default">
		<job-acquisition>default</job-acquisition>
		<configuration>org.camunda.bpm.engine.impl.cfg.StandaloneProcessEngineConfiguration</configuration>
		<datasource>java:jdbc/ProcessEngine</datasource>

		<properties>
			<property name="history">activity</property>
			<property name="databaseSchemaUpdate">true</property>
			<property name="authorizationEnabled">true</property>
			<property name="jobExecutorDeploymentAware">true</property>
			<property name="jobExecutorActivate">true</property>
		</properties>

		<plugins>
			<!-- plugin enabling Process Application event listener support -->
			<plugin>
				<class>org.camunda.bpm.application.impl.event.ProcessApplicationEventListenerPlugin</class>
			</plugin>

I’m using Spring, and in my applicationContext.xml, I get access to the services like this:

<!-- bind the process engine service as Spring Bean -->
<bean name="processEngineService" class="org.camunda.bpm.BpmPlatform" factory-method="getProcessEngineService" />

<!-- bind the default process engine as Spring Bean -->
<bean name="processEngine" factory-bean="processEngineService" factory-method="getDefaultProcessEngine" />

<bean id="repositoryService" factory-bean="processEngine" factory-method="getRepositoryService"/>
<bean id="runtimeService" factory-bean="processEngine" factory-method="getRuntimeService"/>
<bean id="taskService" factory-bean="processEngine" factory-method="getTaskService"/>
<bean id="historyService" factory-bean="processEngine" factory-method="getHistoryService"/>
<bean id="identityService"   factory-bean="processEngine" factory-method="getIdentityService" />
<bean id="managementService" factory-bean="processEngine" factory-method="getManagementService"/>

I wonder if the ThreadPoolExecutor’s rejection policy is defaulting to CallerRunsPolicy? Could it work like this:

  • You get 3 jobs, which run immediately since there are idle core threads.
  • The next 3 jobs are queued, since the queue is empty and has capacity.
  • You then get 3 more jobs. As the queue is now full, 3 more threads are created, which takes us over the core pool size.
  • On the next acquisition, 3 more jobs may trigger the rejection policy, in which case the job acquisition thread runs 1 job itself. If that job blocks for two minutes, there will be no more job acquisition in the meantime…

This is a little simplified, but you can see how it kinda leads to 7 threads… I will have to experiment a little more, but it could be an avenue worth exploring.
Rob

I think I follow what you are saying, and that seems like a plausible theory. If that were the case, then changing the queueSize parameter (default 3) to something else should change the 7 to a different number? That might be something to try. But conceptually, I don’t think the queue should matter for the total number that can run. It should get filled at every JE poll (every 5 seconds or so), and then drained (with new threads created) as long as the number of active threads is less than maxPoolSize. At least that’s what I would expect to happen.

Thanks,
Galen

Hi Galen, if you carefully read the Javadoc on ThreadPoolExecutor, the behaviour is quite interesting, and it also left me with a lot more questions…
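
The bit that is easy to miss: the pool only creates threads beyond corePoolSize when the work queue rejects an offered task, and queued tasks are never handed to newly created threads; they wait for a busy thread to free up. Here is a minimal standalone sketch of that behaviour (the pool shape mirrors what I believe the engine defaults are — core 3, max 10, queue 3 — which is an assumption on my part; whether this maps one-to-one onto the job executor I have not verified):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolGrowthDemo {
  public static void main(String[] args) throws Exception {
    // core 3, max 10, bounded queue of 3 (assumed to mirror the engine defaults)
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        3, 10, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<Runnable>(3));

    // 10 long-running "jobs", like process instances that take minutes
    for (int i = 0; i < 10; i++) {
      pool.execute(new Runnable() {
        public void run() {
          try { Thread.sleep(60000); } catch (InterruptedException ignored) { }
        }
      });
    }

    // Tasks 1-3 start core threads, 4-6 are queued, 7-10 overflow into new
    // threads: 10 submitted - 3 queued = 7 threads. Prints "poolSize=7 queued=3".
    System.out.println("poolSize=" + pool.getPoolSize()
        + " queued=" + pool.getQueue().size());

    pool.shutdownNow();
  }
}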

Anyway, I went through the engine’s code base, and it seems to set the rejection policy explicitly to throw an exception. This is a good thing, as it means the acquisition thread should not be executing jobs.

I ran through a few ThreadPoolExecutor scenarios on paper, but based on documented behaviour, I could not reproduce your experience. I stress documented behaviour as I am still puzzled by a few scenarios… If I get some time, I will dig a little deeper…

Rob