I am currently using Camunda 7.10.0 in a Docker environment.
For the past few days, we have been experiencing very poor performance (response times above 30-40 s, or even timeouts) when starting process instances by definition key via the REST API.
The performance issues occur with all of our process definitions, but especially with large BPMN models (one example is around 300 KB of XML).
Here is some background information that might help; maybe somebody here notices something odd that could be causing the performance issues:
The server runs a MySQL container as the database alongside the Camunda container
The server is a DigitalOcean standard droplet w/ 6 vCores & 16 GB of memory
Camunda’s heap memory size is configured to be 8 GB max. (of which Camunda uses all 8 GB)
History level is set to ACTIVITY
Currently running 1,000 active process instances waiting at intermediate timer events
Approx. 40,000 variables in the ACT_RU_VARIABLE table
Approx. 7,500 deployments
Most of the tasks in our process definitions are external tasks or service tasks w/ Expressions (simple calculations)
We run 9 external task workers (on other servers), each of which polls for tasks at an interval of 300-500 ms via REST
Oftentimes, the first start of a process definition (after not starting that particular definition for a while) takes a long time, and starts directly after that are blazingly fast. Maybe there is some caching of the process definition happening in Camunda on process instance creation?
I hope that somebody here has experienced similar issues and can help out with some tips & tricks.
Hi Max, that external task worker polling interval seems very short, and it could be your issue. You could increase the polling interval (e.g. to several seconds or minutes) and see what happens, or take advantage of long polling to see if that helps. If you require such short intervals, you may want to consider changing your external workers to something synchronous like Java delegates or scripts.
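For a sense of scale: 9 workers polling every 300-500 ms comes to roughly 18-30 fetchAndLock requests per second. A long-polling request could look roughly like the sketch below. It assumes the `asyncResponseTimeout` field of the `POST /external-task/fetchAndLock` endpoint; the base URL, worker id, topic name and the timeout values are placeholders, not taken from your setup.

```python
import json
import urllib.request

# Assumed base URL of the Camunda REST API; adjust to your deployment.
BASE_URL = "http://localhost:8080/engine-rest"


def build_fetch_and_lock_body(worker_id, topic, max_tasks=10,
                              lock_duration_ms=60_000,
                              async_response_timeout_ms=30_000):
    """Build the JSON body for POST /external-task/fetchAndLock.

    With asyncResponseTimeout set, the server holds the request open until
    a task is available or the timeout elapses (long polling).
    """
    return {
        "workerId": worker_id,
        "maxTasks": max_tasks,
        "asyncResponseTimeout": async_response_timeout_ms,
        "topics": [{"topicName": topic, "lockDuration": lock_duration_ms}],
    }


def fetch_and_lock(body, base_url=BASE_URL):
    """Send one long-polling request and return the list of locked tasks."""
    request = urllib.request.Request(
        f"{base_url}/external-task/fetchAndLock",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # The client timeout must exceed asyncResponseTimeout, otherwise the
    # client gives up before the server answers.
    timeout_s = body["asyncResponseTimeout"] / 1000 + 10
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))
```

A worker would then call `fetch_and_lock(build_fetch_and_lock_body("worker-1", "some-topic"))` in a loop without sleeping in between, since the server-side wait replaces the fixed polling interval.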
Thanks @Beagler! I implemented long polling now, which reduced the number of requests from the external workers. My problem persists though.
This morning I had some Process Engine persistence exception errors like this:
09:58:09.462 WARNING [http-nio-8080-exec-126] org.camunda.bpm.engine.rest.exception.ProcessEngineExceptionHandler.toResponse org.camunda.bpm.engine.ProcessEngineException: Process engine persistence exception
at org.camunda.bpm.engine.impl.interceptor.CommandInvocationContext.rethrow(CommandInvocationContext.java:150)
at org.camunda.bpm.engine.impl.interceptor.CommandContext.close(CommandContext.java:177)
at org.camunda.bpm.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:115)
at org.camunda.bpm.engine.impl.interceptor.ProcessApplicationContextInterceptor.execute(ProcessApplicationContextInterceptor.java:69)
at org.camunda.bpm.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:32)
at org.camunda.bpm.engine.impl.externaltask.ExternalTaskQueryTopicBuilderImpl.execute(ExternalTaskQueryTopicBuilderImpl.java:59)
at org.camunda.bpm.engine.rest.impl.FetchAndLockHandlerImpl.executeFetchAndLock(FetchAndLockHandlerImpl.java:227)
at org.camunda.bpm.engine.rest.impl.FetchAndLockHandlerImpl.tryFetchAndLock(FetchAndLockHandlerImpl.java:210)
at org.camunda.bpm.engine.rest.impl.FetchAndLockHandlerImpl.addPendingRequest(FetchAndLockHandlerImpl.java:281)
at org.camunda.bpm.engine.rest.impl.FetchAndLockRestServiceImpl.fetchAndLock(FetchAndLockRestServiceImpl.java:37)
at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.jboss.resteasy.core.MethodInjectorImpl.invoke(MethodInjectorImpl.java:137)
at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTarget(ResourceMethodInvoker.java:296)
at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:250)
at org.jboss.resteasy.core.ResourceLocatorInvoker.invokeOnTargetObject(ResourceLocatorInvoker.java:140)
at org.jboss.resteasy.core.ResourceLocatorInvoker.invoke(ResourceLocatorInvoker.java:103)
at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:377)
at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:200)
at org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.service(ServletContainerDispatcher.java:220)
at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:56)
at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:51)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:741)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.camunda.bpm.engine.rest.filter.CacheControlFilter.doFilter(CacheControlFilter.java:44)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.camunda.bpm.engine.rest.filter.EmptyBodyFilter.doFilter(EmptyBodyFilter.java:98)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:199)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:490)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:139)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:668)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:408)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:770)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1415)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.ibatis.exceptions.PersistenceException:
### Error querying database. Cause: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-nio-8080-exec-126] Timeout: Pool empty. Unable to fetch a connection in 30 seconds, none available[size:20; busy:20; idle:0; lastwait:30000].
### The error may exist in org/camunda/bpm/engine/impl/mapping/entity/ExternalTask.xml
### The error may involve org.camunda.bpm.engine.impl.persistence.entity.ExternalTaskEntity.selectExternalTasksForTopics
### The error occurred while executing a query
### Cause: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-nio-8080-exec-126] Timeout: Pool empty. Unable to fetch a connection in 30 seconds, none available[size:20; busy:20; idle:0; lastwait:30000].
at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:150)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:141)
at org.camunda.bpm.engine.impl.db.sql.DbSqlSession.selectList(DbSqlSession.java:97)
at org.camunda.bpm.engine.impl.db.entitymanager.DbEntityManager.selectListWithRawParameter(DbEntityManager.java:183)
at org.camunda.bpm.engine.impl.db.entitymanager.DbEntityManager.selectList(DbEntityManager.java:175)
at org.camunda.bpm.engine.impl.persistence.entity.ExternalTaskManager.selectExternalTasksForTopics(ExternalTaskManager.java:88)
at org.camunda.bpm.engine.impl.cmd.FetchExternalTasksCmd.execute(FetchExternalTasksCmd.java:69)
at org.camunda.bpm.engine.impl.cmd.FetchExternalTasksCmd.execute(FetchExternalTasksCmd.java:39)
at org.camunda.bpm.engine.impl.interceptor.CommandExecutorImpl.execute(CommandExecutorImpl.java:27)
at org.camunda.bpm.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:106)
... 49 more
Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-nio-8080-exec-126] Timeout: Pool empty. Unable to fetch a connection in 30 seconds, none available[size:20; busy:20; idle:0; lastwait:30000].
at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:712)
at org.apache.tomcat.jdbc.pool.ConnectionPool.getConnection(ConnectionPool.java:198)
at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:132)
at org.apache.ibatis.transaction.jdbc.JdbcTransaction.openConnection(JdbcTransaction.java:138)
at org.apache.ibatis.transaction.jdbc.JdbcTransaction.getConnection(JdbcTransaction.java:60)
at org.apache.ibatis.executor.BaseExecutor.getConnection(BaseExecutor.java:336)
at org.apache.ibatis.executor.BatchExecutor.doQuery(BatchExecutor.java:90)
at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:324)
at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:156)
at org.apache.ibatis.executor.CachingExecutor.query(CachingExecutor.java:109)
at org.apache.ibatis.executor.CachingExecutor.query(CachingExecutor.java:83)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:148)
... 58 more
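The counters at the end of the PoolExhaustedException message are the interesting part: all 20 connections of the pool were busy and the request gave up after waiting 30 seconds. A minimal sketch for pulling these counters out of such log lines, assuming the message format stays as shown in the trace (the function name is made up for illustration):

```python
import re

# Message text copied from the PoolExhaustedException in the trace above.
MESSAGE = ("Timeout: Pool empty. Unable to fetch a connection in 30 seconds, "
           "none available[size:20; busy:20; idle:0; lastwait:30000].")


def parse_pool_stats(message):
    """Extract the key:value counters from the bracket starting with 'size:'."""
    inside = re.search(r"\[(size:[^\]]*)\]", message)
    if inside is None:
        return {}
    return {key: int(value)
            for key, value in re.findall(r"(\w+):(\d+)", inside.group(1))}


stats = parse_pool_stats(MESSAGE)
# The pool is saturated when every connection is in use and none is idle.
saturated = stats["busy"] == stats["size"] and stats["idle"] == 0
print(stats, "saturated:", saturated)
```

Run over a whole log file, this shows whether the pool is exhausted only in bursts or permanently.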
After seeing that error, I enabled the test-on-borrow feature for my database connections, as suggested here in the forum, which leads to some weird behaviour:
It seems like the Process Engine needs to “warm up” before it can function at proper speed. What does that mean? Whenever I start a process after a few minutes of doing nothing with Camunda, it takes 30-60 s to start that process. If I then start another process with the same definition right after that, it starts brilliantly fast.
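To put numbers on this, the two starts can be timed back to back. The sketch below assumes the `POST /process-definition/key/{key}/start` REST endpoint with an empty JSON body; the base URL and the helper names are placeholders for illustration.

```python
import json
import time
import urllib.request

# Assumed base URL of the Camunda REST API; adjust to your deployment.
BASE_URL = "http://localhost:8080/engine-rest"


def start_url(definition_key, base_url=BASE_URL):
    """URL of the 'start process instance by key' endpoint."""
    return f"{base_url}/process-definition/key/{definition_key}/start"


def start_instance(definition_key, base_url=BASE_URL, timeout_s=120):
    """Start one instance without variables and return the response JSON."""
    request = urllib.request.Request(
        start_url(definition_key, base_url),
        data=json.dumps({}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


def timed(action):
    """Run action() and return (result, elapsed seconds)."""
    started = time.perf_counter()
    result = action()
    return result, time.perf_counter() - started


def compare_cold_and_warm(action):
    """Call the same action twice back to back and return both durations."""
    _, cold = timed(action)
    _, warm = timed(action)
    return cold, warm
```

After a few idle minutes, `cold, warm = compare_cold_and_warm(lambda: start_instance("my-process"))` (with `my-process` replaced by a real definition key) gives both durations for the same definition.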