Clustering: Knowing what specific Node something was executed on?

A question was raised related to some docs about Metrics Reporting:

The Reporter Identifier seems to have very confusing docs:

Metrics are reported with an identifier of the reporting party. This identifier allows to attribute reports to individual engine instances when making a metrics query. For example in a cluster, load metrics can be related to individual cluster nodes. By default the process engine generates a reporter id as <local IP>$<engine name> . The generation can be customized by implementing the interface org.camunda.bpm.engine.impl.metrics.MetricsReporterIdProvider and setting the engine property metricsReporterIdProvider to an instance of that class.

Based on this, the sentence "This identifier allows to attribute reports to individual engine instances when making a metrics query. For example in a cluster, load metrics can be related to individual cluster nodes." reads as though each individual instance of an engine (multiple nodes for a single engine) will be reported separately, but this does not seem to make sense. If you have 3 nodes, all for the same engine, connected to the same DB, and you are running metrics on each of the nodes, they will each generate the same metric information, and thus the Metrics Reporter does not provide any additional value. The Reporter field seems to be valuable only when you are running multiple engines in a cluster, meaning you have different engine names on the same IPs, which is what would make the IP/EngineName reporting meaningful.

@thorben can you clarify the usage here?

Furthermore, this raised an additional reporting question: what is the way to track Engine Node Instance execution? In other words, how do you know which node specific activities were executed on? There does not seem to be a place to store this information.

@Camunda, any advice on this?

Further example:

…The LOCK_OWNER_ column is updated with a value uniquely identifying the current job executor instance. In a clustered scenario this could be a node name uniquely identifying the current cluster node…

But this information does not seem to be configurable, and it is not saved as part of the activity history?

Hi Stephen,

Not sure I understand your points. Anyway, the reporter id is generated as <local IP>$<engine name>. That means, the reporter id should be unique in both cases:

  1. One JVM runs multiple engines with different names
  2. Multiple JVMs run one engine each with the same name (edit: on different cluster nodes)

So I don’t see why it should only be useful in case 1.
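The two cases above can be sketched in a few lines of Java. This is only an illustration of the documented `<local IP>$<engine name>` shape; the class `ReporterIds` and its methods are hypothetical names made up for this example and are not part of the Camunda API.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration only; not a Camunda class.
class ReporterIds {

    // Builds an id in the documented default shape: <local IP>$<engine name>.
    static String format(String localIp, String engineName) {
        return localIp + "$" + engineName;
    }

    // Same shape, but using the address the JVM resolves for the local host.
    static String forLocalHost(String engineName) throws UnknownHostException {
        return format(InetAddress.getLocalHost().getHostAddress(), engineName);
    }
}
```

With case 1, `format("10.0.0.1", "engineA")` and `format("10.0.0.1", "engineB")` differ in the engine name part. With case 2, `format("10.0.0.1", "default")` and `format("10.0.0.2", "default")` differ in the IP part. The IP addresses and engine names here are made-up values.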

Again, not sure I understand your points. However, let me give an example: one metric that is collected is the number of flow node instances run on a process engine. These numbers will be different for each of the process engines, e.g. if you start a process instance on one of the engines, it will only count towards the metrics there.
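As a toy model of that example (not Camunda's actual metrics code), the following sketch keeps one counter per reporter id, so work done on one engine only shows up under that engine's id. The class name `PerReporterCounts` and its methods are assumptions made up for this illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical toy model; not the engine's real metrics implementation.
class PerReporterCounts {

    // One thread-safe counter per reporter id.
    private final Map<String, AtomicLong> counts = new ConcurrentHashMap<>();

    // Called whenever the engine identified by reporterId runs a flow node.
    void recordFlowNode(String reporterId) {
        counts.computeIfAbsent(reporterId, id -> new AtomicLong()).incrementAndGet();
    }

    // Returns the count for one reporter, or 0 if it has reported nothing.
    long countFor(String reporterId) {
        AtomicLong counter = counts.get(reporterId);
        return counter == null ? 0L : counter.get();
    }
}
```

If three flow nodes run on the engine reporting as `10.0.0.1$default` and none on `10.0.0.2$default`, the first reporter's count is 3 and the second's stays at 0, even though both engines share the same name.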

I hope I could clarify some things. If not, please try to rephrase your questions.

Cheers,
Thorben

Okay, so digging deeper I am seeing some of the complexity here. The metrics system collects everything through execution listeners that increment atomic counters, and your timer-based collection of those counters is stored in the DB using the reporter as the unique grouping? The metrics would get replaced on the next timer occurrence based on the reporter. Is this correct?

So if we ignore the built-in Camunda metrics system for a moment, the core question is: is it possible to know which engine executed a specific activity (as part of the history system)?

Yes.

That is not possible.

Are there other items in the engine that depend on a unique engine name when used in a cluster?

@thorben we were thinking about having a transaction listener that monitors execution entities. Whenever an execution entity is created, we would log the ID of the node and the execution ID. Then later, if someone wanted to track which activities were executed on which node, they could cross-reference the execution against the K/V store of execution IDs and node IDs. Do you foresee any complications with using the execution IDs as UUIDs in this form?
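A minimal sketch of the K/V idea, assuming an in-memory map stands in for the real store and that the listener simply calls `recordCreated` with the execution ID and the node ID. `ExecutionNodeRegistry` and its methods are hypothetical names for this example; the listener wiring into the engine is not shown.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the proposed K/V store.
class ExecutionNodeRegistry {

    // Maps execution id -> id of the node that created the execution.
    private final Map<String, String> nodeByExecution = new ConcurrentHashMap<>();

    // Records the creating node; returns false if the execution id was already known.
    boolean recordCreated(String executionId, String nodeId) {
        return nodeByExecution.putIfAbsent(executionId, nodeId) == null;
    }

    // Looks up which node created the given execution, if it was recorded.
    Optional<String> nodeFor(String executionId) {
        return Optional.ofNullable(nodeByExecution.get(executionId));
    }
}
```

Note that this keeps only the first node recorded per execution ID, i.e. the node that created the execution. A `false` return from `recordCreated` would flag a duplicate execution ID, which is exactly the case the uniqueness question is about.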