Memory Leak in Camunda 7.12

Thanuja_Vadlamudi · July 8, 2024, 9:40am

I am running the Camunda process server in a pod (deployed in Rancher Kubernetes). The pod crashes every 10 minutes due to a memory leak. I see that some iBatis configuration objects are taking up a lot of memory. Can you please help me find where exactly the memory is being consumed and how to resolve the issue?

The entry below is one of those in the list taking the most memory:

Class Name | Ref. Objects | Shallow Heap | Ref. Shallow Heap | Retained Heap
value char[95] @ 0x6c6b28fb0 org.camunda.bpm.engine.impl.persistence.entity.TaskEntity.selectTaskCountByQueryCriteria-Inline | 1 | 208 | 208 | 208

We are using the versions below.
springBootVersion = '2.1.9.RELEASE'
camundaVersion = '7.12.0'
camundaStarterVersion = '3.3.5'

Alex_Voloshyn · July 8, 2024, 9:56am

Hi @Thanuja_Vadlamudi
Camunda 7.12 was released five years ago and the maintenance ended three years ago so I do not think anything is going to be fixed for this release even if there is a memory leak.

Regards,
Alex

mimaom · July 8, 2024, 10:06am

Hi @Thanuja_Vadlamudi

I agree with @Alex_Voloshyn
Can you do an update to the latest version to see if that helps?

I have also used Camunda 7.12 and I cannot recall (off the top of my head) any memory issues.

Maybe the problem lies elsewhere.
Can you provide some information about your setup?

What does your process do?
Does it perform activities that require a lot of memory?
Does this always happen or only under heavy load / certain environments etc?

BR
Michael

Thanuja_Vadlamudi · July 8, 2024, 3:37pm

Thank you, Alex.

Hi Michael,

We have written some wrapper classes on top of Camunda Engine 7.12 and deployed the Docker image of that service to Rancher Kubernetes. The service crashes after a few minutes. It does not even finish starting.
We are using the versions below.
springBootVersion = '2.1.9.RELEASE'
camundaVersion = '7.12.0'
camundaStarterVersion = '3.3.5'

Can you suggest what step I need to take? I can’t jump directly to a later version. If you suggest upgrading, which version should I upgrade to, and what are the steps to migrate? Even on the existing Linux servers, it restarts automatically once every 2 days. We got alerts from AppDynamics that JVM utilization is too high.

I am not able to attach the heap dump file here. It shows iBatis-related objects taking up the most memory. Is there any way I can upload the heap dump here?

Regards,
Thanuja.

mimaom · July 9, 2024, 5:29am

Hi @Thanuja_Vadlamudi

What do you mean by some wrapper classes on top of Camunda?
Is the crash consistent? Does it happen in all your environments? Does it also happen in your local dev environment (on a different DB)?

BR
Michael

Thanuja_Vadlamudi · July 9, 2024, 8:29am

Hi Michael,

By wrapper classes I mean that we have defined a REST template on top of the Camunda engine 7.12.

For example, we have written a REST service to get active instances. Inside it, we use Camunda classes to get the active instances:
List<HistoricActivityInstance> historicActivityInstances = this.historyService
        .createHistoricActivityInstanceQuery().processInstanceId(processInstanceId).list();

It is crashing everywhere. On the existing servers, the Camunda engine crashes every 2 days, since it runs on a dedicated server. In Rancher, where it runs in a pod (512 MB of memory specified when running the JAR), it crashes after a few minutes due to the memory leak.

It is happening in all environments, including local. I took a heap dump and observed that iBatis-related objects are taking up the most memory.

Regards,
Thanuja.
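
Before digging further into the heap dump, it can help to log what heap limit the JVM actually sees inside the pod and how much of it is in use right after startup. The following is a minimal diagnostic sketch that uses only the standard `java.lang.management` API. The class name `HeapReport` and the idea of calling it from a startup hook are illustrative assumptions, not part of the original application.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: prints the heap figures reported by the running JVM.
public class HeapReport {

    private static final long MB = 1024L * 1024L;

    /** Formats a byte count as whole megabytes; a negative value means "undefined". */
    public static String toMb(long bytes) {
        return bytes < 0 ? "undefined" : (bytes / MB) + " MB";
    }

    /** Builds a one-line summary of current heap usage as seen by this JVM. */
    public static String describeHeap() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return "heap used=" + toMb(heap.getUsed())
                + ", committed=" + toMb(heap.getCommitted())
                + ", max=" + toMb(heap.getMax());
    }

    public static void main(String[] args) {
        System.out.println(describeHeap());
    }
}
```

If the reported `max` differs from what is expected for a 512 MB pod, the heap setting (for example `-Xmx`) is worth checking first.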

mimaom · July 9, 2024, 8:49am

Hi @Thanuja_Vadlamudi

Is the code you show actually (always?) being executed on the crashing Camunda servers? This query will load all activity instances (including the historic ones). Could you have a very big process that has run for a long while or has performed a lot of activities (are you by any chance using timers that fire very often, or a lot of messages / signals)?

Could you try to call the Camunda REST API with this operation, just to get the count of the activities that would be returned:

BR
Michael
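
The concern in this post is that `list()` materializes every matching row in memory at once. Camunda's query API also offers `count()` and `listPage(firstResult, maxResults)`, so one option is to check the size first and then read in bounded pages. The sketch below shows that pattern in plain Java. `PagedSource` is a hypothetical stand-in for a Camunda query, and `InMemorySource` exists only so the example runs without an engine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class PagedReader {

    /** Hypothetical stand-in for a query that supports count() and listPage(). */
    public interface PagedSource<T> {
        long count();
        List<T> listPage(int firstResult, int maxResults);
    }

    /** In-memory source used only to make this sketch runnable. */
    public static class InMemorySource<T> implements PagedSource<T> {
        private final List<T> items;

        public InMemorySource(List<T> items) {
            this.items = items;
        }

        @Override
        public long count() {
            return items.size();
        }

        @Override
        public List<T> listPage(int firstResult, int maxResults) {
            int from = Math.min(firstResult, items.size());
            int to = Math.min(from + maxResults, items.size());
            return new ArrayList<>(items.subList(from, to));
        }
    }

    /**
     * Visits every item while holding at most pageSize items in memory at a time.
     * Returns the number of non-empty pages that were fetched.
     */
    public static <T> int forEachPage(PagedSource<T> source, int pageSize, Consumer<T> visitor) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int pages = 0;
        int first = 0;
        while (true) {
            List<T> page = source.listPage(first, pageSize);
            if (page.isEmpty()) {
                break; // nothing left to read
            }
            page.forEach(visitor);
            pages++;
            first += page.size();
            if (page.size() < pageSize) {
                break; // a short page means the end was reached
            }
        }
        return pages;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            data.add(i);
        }
        PagedSource<Integer> source = new InMemorySource<>(data);
        System.out.println("total rows: " + source.count());
        List<Integer> seen = new ArrayList<>();
        int pages = forEachPage(source, 10, seen::add);
        System.out.println("pages fetched: " + pages + ", rows visited: " + seen.size());
    }
}
```

With a real query, an explicit ordering would likely be needed so that consecutive pages do not overlap or skip rows.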

Thanuja_Vadlamudi · July 9, 2024, 9:16am

Hi Michael,

I just provided one example. That is not the one which is crashing. I am not sure where exactly the memory leak is happening. When I took the heap dump, it showed the entry below taking the most memory. We have deployed the Camunda engine service in Rancher, where it is not processing any traffic, but it still crashes after a few minutes. We have pointed it to dev1 and are checking it.

May I know how this service (just to get the count of the activities) is related to this problem?

Regards,
Thanuja.

mimaom · July 9, 2024, 9:30am

Hi @Thanuja_Vadlamudi

That service (get activities) may not have anything to do with the problem. I was trying to find the root cause of the problem. Your application must be performing some kind of actions. Something must be started / triggered when the server starts. How are your processes deployed? Are they part of an embedded application (Spring app)? If you look inside the Camunda Cockpit after the server is started, how many process definitions are deployed, and how many process instances are running in total?

BR
Michael

Thanuja_Vadlamudi · July 9, 2024, 9:57am

I see the below count in dev cockpit.

We are using the Community version (Spring Boot app). We deploy process definitions as shown below. We look for specific messages and trigger the workflow based on the message. We have defined our own table, swf_message_queue. Events listen for those messages: an event consumes a message from that table and proceeds further.

Regards,
Thanuja.

mimaom · July 9, 2024, 10:22am

Hi @Thanuja_Vadlamudi

That probably explains your issues. You have almost 300,000 running process instances (and 6,600 process incidents, which means processes that have encountered some kind of error).

That is quite a lot of instances. You said Camunda crashes in all environments? Do you have that many processes running in all your environments - even in your development environment?

If these are all valid process instances that you need to support, you should look into Camunda’s cluster support. Have a look at:

and also:

You may also benefit from looking into a way to split your processes into logical (business) units. Either using multi tenancy as described here:

Or simply split your process applications into separate databases / schemas and that way reduce the number of process instances that each process application needs to handle.

With this number of running process instances, I would start looking into this. I noticed that you have 22 process definitions - perhaps they can be divided into a number of logical business units?

I hope that helps.

BR
Michael

Thanuja_Vadlamudi · July 9, 2024, 12:12pm

All environments have the same data, which was pulled from PROD. That is the only way we can see whether we have any performance issues. The 22 process definitions are divided among different services as shown in the diagram (Swift, Fastr, CPO and R4). All these services use the Camunda process engine service to create workflows.

Regarding "simply split your process applications into separate databases / schemas and that way reduce the number of process instances that each process application needs to handle": does that mean having a different DB for each service? For example, Swift has its own DB and Fastr has its own DB.

Regards,
Thanuja.

mimaom · July 9, 2024, 12:34pm

Hi @Thanuja_Vadlamudi

From the screenshot it looks like you are already using Camunda’s multi tenancy. Is that right? If so, how is the multi tenancy configured?

BR
Michael

Thanuja_Vadlamudi · July 9, 2024, 12:51pm

I think we have configured it as shown below.

@Override
public void create(Definition processDefinition) {
	if (exists(processDefinition)) {
		throw new RuntimeException("Process definition already exists");
	}

	// verify that the model loading is a camunda model
	if (!APPLICATION_CAMUNDA_BPMN.equalsIgnoreCase(processDefinition.getModelContentType())) {
		throw new RuntimeException("Process definition is not the correct content type");
	}
	
	LOGGER.info("Deploying: {} for domain {}", processDefinition.getName(), processDefinition.getDomain());
	
	/*
	BpmnModelInstance model = Bpmn.readModelFromStream(new ByteArrayInputStream(processDefinition.getModel().getBytes()));
	CamundaProperty domainProperty = model.newInstance(CamundaProperty.class);
	domainProperty.setCamundaName("domain");
	domainProperty.setCamundaValue(processDefinition.getDomain());
	*/
	
	DeploymentWithDefinitions dd = repositoryService.createDeployment()
	.name(processDefinition.getName())
	.tenantId(processDefinition.getDomain())
	.addInputStream(processDefinition.getName(), getModelAsStream(processDefinition))
	.deployWithResult();
	
	LOGGER.info("Deployment Complete ID: {}, Name: {}, Tenant ID: {}", dd.getId(), dd.getName(), dd.getTenantId());
	
}

mimaom · July 10, 2024, 7:36am

Hi @Thanuja_Vadlamudi

I was wondering more about whether the multi tenancy is properly configured and is already split into multiple databases?

Otherwise I am running out of suggestions.

But I still suspect that the large number of process instances is the place to look.

BR
Michael

Thanuja_Vadlamudi · July 10, 2024, 8:38am

Hi Michael,

Multi tenancy is properly configured. But we are using the same database for all the services (Swift, Fastr, R4 and CPO).

Regards,
Thanuja.

mimaom · July 10, 2024, 10:35am

Hi @Alex_Voloshyn

OK, so to recap: You are running a number of Spring Boot applications with an embedded process engine. Each application is configured to use multi tenancy, but with the same database - which means your database uses a tenant identifier column to isolate the tenants.

That setup should limit the process instances each application handles to those of its own tenant, but I do not have much experience with multi tenancy. Maybe there is something missing / wrong?

So you say this also happens in your dev environment (which has the same data in its database). Do all your Spring Boot applications crash when they are started? What happens if you only start a single application in your dev environment?

BR
Michael

Thanuja_Vadlamudi · July 10, 2024, 1:10pm

Hi Michael,

I downloaded camunda-bpm-run-7.13.0 from the Camunda Download Center and ran it locally, pointing it at our dev1 DB.

It is throwing an error: ENGINE-16004 Exception while closing command context: ENGINE-03004 Exception while executing Database Operation 'UPDATE EverLivingJobEntity[97df3af7-45bf-11e9-bfc4-1866da3af980]' with message '

Error flushing statements. Cause: org.apache.ibatis.executor.BatchExecutorException: org.camunda.bpm.engine.impl.persistence.entity.JobEntity.updateEverLivingJob (batch index #1) failed. Cause: java.sql.BatchUpdateException: ORA-00904: "FAILED_ACT_ID_": invalid identifier.

I think this is due to a Camunda engine version mismatch. We are using camundaVersion = '7.12.0'. Can you please let me know how I can get camunda-bpm-run-7.12.0?

In the existing environment, we are using Oracle JDK 1.8 with camundaVersion 7.12.0 (but the Cockpit shows "Powered by camunda BPM / v7.11.0"). In Rancher, however, we are building the image with OpenJDK 1.8. Is there any issue with this JDK version?

Regards,
Thanuja.

mimaom · July 10, 2024, 1:41pm

Hi @Thanuja_Vadlamudi

There is no Camunda run for version 7.12.
If you want to upgrade, make sure to read the update notes. There will most likely also be an SQL script you need to run.

You should add the updates one at a time and run the SQL scripts for each update to move from 7.12 → 7.13 → 7.14 etc. And if you want to go that way, you might also want to create a separate database just for testing the update!

The fact that your Cockpit says you are running 7.11, but you are using 7.12 binaries in your applications, means that you also did not apply the SQL update scripts from version 7.11 → 7.12.

You really should make sure your Camunda database version matches your Camunda binaries! Otherwise there is no saying what kind of trouble you could be facing.

BR
Michael
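
The one-minor-version-at-a-time rule from this post can be sketched as a small helper that lists the upgrade steps between the schema version in the database and the version of the binaries. The file-name pattern `<db>_engine_<from>_to_<to>.sql` is an assumption based on how the `sql/upgrade` folder of Camunda 7 distributions is typically laid out, so check the actual file names for each version before running anything. `UpgradePath` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists one upgrade script per minor version step.
public class UpgradePath {

    /** Parses the minor number from a 7.x version string, e.g. "7.11" or "7.12.0". */
    public static int minorOf(String version) {
        String[] parts = version.split("\\.");
        if (parts.length < 2 || !"7".equals(parts[0])) {
            throw new IllegalArgumentException("expected a 7.x version, got: " + version);
        }
        return Integer.parseInt(parts[1]);
    }

    /**
     * Returns the scripts in the order they would be applied, one per minor step.
     * The file-name pattern is an assumption; verify it against the distribution.
     */
    public static List<String> scripts(String database, String from, String to) {
        int start = minorOf(from);
        int end = minorOf(to);
        if (end < start) {
            throw new IllegalArgumentException("target version is older than current version");
        }
        List<String> result = new ArrayList<>();
        for (int minor = start; minor < end; minor++) {
            result.add(database + "_engine_7." + minor + "_to_7." + (minor + 1) + ".sql");
        }
        return result;
    }

    public static void main(String[] args) {
        // Example: schema still at 7.11, target binaries at 7.13, Oracle database.
        scripts("oracle", "7.11", "7.13").forEach(System.out::println);
    }
}
```

Patch-level scripts, if any exist for a given version, are not covered by this sketch, and each step is best tried on a separate test database first, as suggested above.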

Thanuja_Vadlamudi · July 10, 2024, 4:02pm

I was thinking the same. Can you provide the steps to apply the update scripts from 7.11 → 7.12? How do I get the scripts to update from 7.11 to 7.12?

Regards,
Thanuja.