Manipulating process history

For my master's thesis I am exploring how I can use the process engine for discrete event simulation. Thanks to its well-designed interfaces, this has so far gone without any major problems.

My intention now is to override all relevant timestamps, which are generated in real time while the process executes, with simulated times. This will give the impression of a long-running process.

The goal is to stay compatible with Camunda Optimize so that I can use it for analysis.

However, navigating the database has been a bit tricky, and I was hoping I could get some input on this…
Where does Optimize pull its data from in the database, and what data does it pull? It would be very useful to get some insight into the queries it executes.

I initially hoped that it would perform a clean import of the history tables after forcing the reimport script, but there is seemingly much more logic happening in the background.

E.g.:
I first imported all history and created a single report.
Then I deleted a single row (one process instance) from ACT_HI_PROCINST.
(The original process had 8 instances, so after the deletion it should have 7.)
Then I forced a refresh via the reimport script in Camunda Optimize.
To my surprise, I am now told that this same process has 1 instance as opposed to the expected 7?

Would be great if I (and the community) could get some input as to how this is pieced together :slight_smile:

Thanks!

Herman

Currently figuring out how I can have zero persistence, so that every time I spin up the process engine and Optimize I get to start from a clean slate…

Excuse the ranting, but this might be useful for the next guy, so I might as well document my findings…

Nuking Elasticsearch is no bueno because you have to input the license key every time (big hassle).

After poking around in Elasticsearch I am starting to see how this is configured. One could easily create a script that effectively resets everything on startup.

Notes:
Elasticsearch is available on port 9200 by default (have a look at the Optimize config).

Elasticsearch does not appear to ship its own UI, but there are many projects that solve this for you. Have a look on GitHub or Google.

I compared the generated indices and found the following:
Delete all documents in the following indices:
optimize-process-definition_v4
optimize-single-process-report_v6
optimize-dashboard_v4

Delete the entire index, documents included, for every index prefixed with “optimize-process-instance-”, e.g.:
optimize-process-instance-gotostore123_v6
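
For anyone who wants to automate this, here is a rough sketch of such a reset script in Python, using only the standard library and the Elasticsearch REST API (`_delete_by_query`, `_cat/indices`, and `DELETE /<index>`). It assumes Elasticsearch runs unsecured on localhost:9200 and that the index names match the ones I listed; the version suffixes (`_v4`, `_v6`) will probably differ between Optimize versions. The function names are my own, and I have only reasoned this through against my setup, so treat it as a starting point rather than a finished tool.

```python
import json
import urllib.request

# Assumption: default host and port; check the Optimize config for your setup.
ES_URL = "http://localhost:9200"

# Indices whose documents are deleted while the index itself is kept.
CLEAR_DOCUMENTS = [
    "optimize-process-definition_v4",
    "optimize-single-process-report_v6",
    "optimize-dashboard_v4",
]

# Indices starting with this prefix are deleted entirely.
INSTANCE_PREFIX = "optimize-process-instance-"


def build_reset_requests(instance_indices):
    """Return (method, path, body) tuples for the reset. No network access."""
    match_all = {"query": {"match_all": {}}}
    requests = [
        ("POST", f"/{name}/_delete_by_query?refresh=true", match_all)
        for name in CLEAR_DOCUMENTS
    ]
    requests += [
        ("DELETE", f"/{name}", None)
        for name in instance_indices
        if name.startswith(INSTANCE_PREFIX)
    ]
    return requests


def send(method, path, body=None):
    """Send one JSON request to Elasticsearch and return the parsed response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        ES_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def list_instance_indices():
    """Ask Elasticsearch for the names of all process instance indices."""
    rows = send("GET", f"/_cat/indices/{INSTANCE_PREFIX}*?format=json&h=index")
    return [row["index"] for row in rows]


if __name__ == "__main__":
    for method, path, body in build_reset_requests(list_instance_indices()):
        print(method, path)
        send(method, path, body)
```

The instance indices are looked up and deleted by name rather than with a wildcard DELETE, since some Elasticsearch configurations reject wildcard deletes. Nothing else in Elasticsearch is touched.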

Hi @hpl002,

Optimize queries data from quite a few different engine database tables during its import cycle. This page in our Optimize docs gives you a broad overview of how the import cycle within Optimize works, though it doesn’t include much detail on exact queries or database sources.
However, Optimize uses a separate Optimize endpoint within the engine. You can have a look at some of the implementation here; this might give you some insight into what kind of data Optimize queries.

It’s hard to tell what exactly happened here without seeing the data and logs, but generally speaking there is no one-to-one relationship between engine and Optimize entities. In the case of process instances, we query a few different tables and merge this data into Optimize’s ProcessInstance tables, so ACT_HI_PROCINST is only one of the tables Optimize imports instance data from.

You can add the license key as a file to the path ${optimize-root-folder}/config/OptimizeLicense.txt so it doesn’t rely on what’s saved in ES. If you’re using docker, you can check here how to add the license key file.

Hope that helps!

Thank you for the detailed reply! This will undoubtedly help me make some progress.