History cleanup not working

I’m running the Docker distribution of Camunda 7.10.

I have a lot of old data without removal times set that I want to remove.
I updated my BPMN with a TTL and redeployed. I then ran the REST API call to update the TTL on all the old process definitions.
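
For reference, the TTL update I ran for each definition looked roughly like this ({id} stands for the full process definition id):

PUT /process-definition/{id}/history-time-to-live
{ "historyTimeToLive": 5 }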

I’ve tried running the history cleanup API call to run the job immediately, but it doesn’t seem to do anything. My cleanup job is there in the jobs table, but its due date is always in the future. It’s unclear whether it’s actually running, but it is certainly not deleting anything.
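
The call I used to trigger the job immediately was along these lines:

POST /history/cleanup?immediatelyDue=true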

Here is the relevant section of my bpm-platform.xml:

    <properties>
      <property name="history">activity</property>
      <property name="jobExecutorActivate">true</property>
      <property name="databaseSchemaUpdate">true</property>
      <property name="authorizationEnabled">true</property>
      <property name="jobExecutorDeploymentAware">false</property>
      <property name="historyCleanupBatchWindowStartTime">00:02</property>
      <property name="historyCleanupBatchWindowEndTime">23:59</property>
      <property name="historyCleanupStrategy">endTimeBased</property>
    </properties>

I have been reading other threads on this issue and trying what was suggested, but apart from going into the database and manually deleting all the old data, I haven’t found anything that fixes my problem.

Thanks,
Megan

Nothing is immediately jumping out at me… Just as a sanity check, can you post the results of a few REST API calls? I want to verify things are as we suspect:

GET /history/cleanup/configuration
GET /history/cleanup/jobs
GET /process-definition

That last one could be a bit lengthy depending on how many definitions you have. Maybe just pick one definition to focus on while you track down the issue? Mainly I’m just looking to verify that the history TTL is set on the definitions whose instances you want to clean up.

Lastly, how are you verifying that history is being cleaned up (or in this case not cleaned up)? In past threads, I’ve found that the expectation of what should be removed often doesn’t match what is actually being removed.

Thanks for your response.
GET /history/cleanup/configuration
returns:

{
  "batchWindowStartTime": "2020-10-28T00:02:00.000+0000",
  "batchWindowEndTime": "2020-10-28T23:59:00.000+0000"
}
GET /history/cleanup/jobs
returns:

{
  "id": "3528f452-092f-11e9-97b6-0242ac180002",
  "jobDefinitionId": null,
  "processInstanceId": null,
  "processDefinitionId": null,
  "processDefinitionKey": null,
  "executionId": null,
  "exceptionMessage": null,
  "retries": 3,
  "dueDate": "2020-10-28T18:24:10.000+0000",
  "suspended": false,
  "priority": 0,
  "tenantId": null,
  "createTime": "2018-12-26T16:56:37.000+0000"
}

GET /process-definition
returns:

{
  "id": "cmeProcess:7:c432d37e-dbff-11ea-a2e0-0242ac1c0002",
  "key": "cmeProcess",
  "category": "http://bpmn.io/schema/bpmn",
  "description": null,
  "name": null,
  "version": 7,
  "resource": "cmeProcess.bpmn",
  "deploymentId": "c42f29fb-dbff-11ea-a2e0-0242ac1c0002",
  "diagram": null,
  "suspended": false,
  "tenantId": null,
  "versionTag": null,
  "historyTimeToLive": 5,
  "startableInTasklist": true
},
{
  "id": "dacProcess:5:412a5669-8bac-11e9-81ba-0242ac1e0002",
  "key": "dacProcess",
  "category": "http://bpmn.io/schema/bpmn",
  "description": null,
  "name": "dac",
  "version": 5,
  "resource": "dacProcess.bpmn",
  "deploymentId": "41248a07-8bac-11e9-81ba-0242ac1e0002",
  "diagram": null,
  "suspended": false,
  "tenantId": null,
  "versionTag": null,
  "historyTimeToLive": 5,
  "startableInTasklist": true
}, …

To determine whether the jobs are doing anything, I was looking at the size of the history tables. Right now I am working with our staging server, where nothing is really happening other than what I am doing, and the sizes of the tables are identical to when I started trying to configure this history cleanup yesterday.

Also, in my large ACT_HI_ACTINST table I have hundreds of instances whose start and end times are up to two years old.

I’m at a bit of a loss, as things are looking as I’d expect. I even took your <properties> and dropped them into a 7.10 container and was able to clean up history just fine, so there’s something else happening…

Just for another sanity check, can you post the output of one of the following:
GET /history/process-instance?finished=true&processDefinitionId=cmeProcess:7:c432d37e-dbff-11ea-a2e0-0242ac1c0002
or
GET /history/process-instance?finished=true&processDefinitionId=dacProcess:5:412a5669-8bac-11e9-81ba-0242ac1e0002

Just looking to double check that there are candidates that would fall within the endTime + TTL window as we’d expect.
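
As a quicker check than watching table sizes, the count endpoints will tell you directly whether rows are going away, e.g.:

GET /history/process-instance/count?finished=true

If that number drops after the cleanup job fires, deletion is happening even if the files on disk stay the same size.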

In our scenario I see that records are getting deleted according to the TTL that is set, but this doesn’t reclaim the space in the database the way manually truncating the tables (based on the partitions we created) does. Is there any configuration or setting we can use to achieve a true cleanup?
Version used: 7.10

Hi @Gopinath,

on some databases, there is a difference between deleting records and freeing up the disk space.

The latter is usually a low-level database operation and is not performed by the application.
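
For example, on PostgreSQL something like the following, run directly against the database (table name just as an example), gives the space back to the operating system:

VACUUM FULL act_hi_actinst;

On MySQL/InnoDB the rough equivalent would be OPTIMIZE TABLE.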

Hope this helps, Ingo


Hi Megan,

I am also facing the same issue. Any solution?

Thanks,
Gowtham