Active-active setup for a Camunda Spring Boot embedded engine microservice

Hi, we are currently analysing how a Camunda Spring Boot embedded engine microservice would work in an active-active multi-site setup.
An Oracle DB will run at both sites, replicated using GoldenGate, and the microservice apps will run on top of these DBs.

Now consider a scenario: an order is received by one site (say site A) and the microservice ms-1 (Spring Boot with embedded Camunda) starts processing it. The DB at site A is populated with the transactions, and GoldenGate then replicates the data to site B.
Suppose the process contains a timer. The process engine is active at both sites, so even though the order was not processed via site B, data replication means both sites hold the same process instance in an active state, waiting at the timer. When the timer is due, it will fire at both sites and create a mess. Am I thinking about this correctly? How can an active-active setup for embedded Camunda Spring Boot be achieved with DB replication in place on both sides?
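
As a rough illustration of the concern described above, here is a toy model in Java. It is not Camunda or GoldenGate code: `SiteDb`, `replicate` and `fireDueTimers` are hypothetical names, replication is reduced to copying rows, and it assumes each site's job executor reads only its local DB and fires before the other site's delete has been replicated.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these classes are hypothetical, not Camunda or GoldenGate APIs.
class ReplicatedTimerDemo {

    // One site's database, reduced to a table of timer jobs (jobId -> due time in ms).
    static class SiteDb {
        final String site;
        final Map<String, Long> timerJobs = new HashMap<>();

        SiteDb(String site) {
            this.site = site;
        }
    }

    // Asynchronous replication reduced to "copy every row from source to target".
    static void replicate(SiteDb source, SiteDb target) {
        target.timerJobs.putAll(source.timerJobs);
    }

    // Each site's job executor sees only its local DB and fires every due timer in it.
    static List<String> fireDueTimers(SiteDb db, long now) {
        List<String> fired = new ArrayList<>();
        for (Map.Entry<String, Long> job : db.timerJobs.entrySet()) {
            if (job.getValue() <= now) {
                fired.add(db.site + " fired " + job.getKey());
            }
        }
        // The local delete would reach the other site only on a later replication cycle.
        db.timerJobs.values().removeIf(due -> due <= now);
        return fired;
    }

    static List<String> runScenario() {
        SiteDb siteA = new SiteDb("site A");
        SiteDb siteB = new SiteDb("site B");
        siteA.timerJobs.put("timer-order-1", 1_000L); // order received at site A
        replicate(siteA, siteB);                      // row is copied to site B

        List<String> events = new ArrayList<>();
        events.addAll(fireDueTimers(siteA, 1_000L));
        events.addAll(fireDueTimers(siteB, 1_000L));
        return events;
    }

    public static void main(String[] args) {
        runScenario().forEach(System.out::println);
    }
}
```

In this model the same timer is reported once per site, which is the duplicate execution the question is about. Whether a real setup behaves this way depends on replication lag and on how each engine acquires jobs.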

@anisk To prevent the job acquisition on node 2 from picking up jobs that belong to node 1, the process engine can be configured as deployment aware by setting the following property in the process engine configuration:

<process-engine name="default">
  ...
  <properties>
    <property name="jobExecutorDeploymentAware">true</property>
    ...
  </properties>
</process-engine>

Refer to the documentation on job execution in heterogeneous clusters.
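
To make the effect of that flag concrete, here is a simplified Java model of deployment-aware job acquisition. It is only a sketch and does not use Camunda classes: `Job`, `acquire` and the deployment IDs are hypothetical, and the real engine does this filtering in its job acquisition query.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Simplified model only: Job, acquire and the IDs are hypothetical, not Camunda classes.
class DeploymentAwareDemo {

    // A job row together with the deployment its process definition belongs to.
    static class Job {
        final String id;
        final String deploymentId;

        Job(String id, String deploymentId) {
            this.id = id;
            this.deploymentId = deploymentId;
        }
    }

    // With deployment awareness on, a node acquires only jobs whose deployment
    // it has registered locally; with it off, every job is a candidate.
    static List<String> acquire(List<Job> jobTable,
                                Set<String> registeredDeployments,
                                boolean deploymentAware) {
        return jobTable.stream()
                .filter(job -> !deploymentAware
                        || registeredDeployments.contains(job.deploymentId))
                .map(job -> job.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Job> jobTable = List.of(
                new Job("job-1", "dep-node1"),
                new Job("job-2", "dep-node2"));

        // Node 2 has registered only its own deployment.
        System.out.println(acquire(jobTable, Set.of("dep-node2"), true));  // [job-2]
        System.out.println(acquire(jobTable, Set.of("dep-node2"), false)); // [job-1, job-2]
    }
}
```

The point of the sketch is that the flag narrows which jobs a node will pick up; it does not by itself make two nodes take over each other's work.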

@aravindhrs Thanks for your prompt response. If I do that, the jobs at site B would not start. But the main aim is high availability through active-active. If I apply this configuration, I have to ensure that only one site receives and processes orders, since the other site (site B) would not execute any jobs.
Now suppose site A, where job execution is enabled, is receiving orders, but at some point site A goes down and all orders and jobs have to be executed via site B. As site B is configured not to execute any jobs, the active-active setup will not be fully functional unless I change the process engine configuration of site B when site A is down.
Please validate my understanding!

Hi @anisk

I'm assuming that your architecture is a common Spring Boot app with two deployments using an active/active DB cluster…

A Camunda engine requirement is that the transaction isolation level must be READ COMMITTED. If you use an active/active DB tier, it looks like GoldenGate does asynchronous change-data-capture replication. Hence in multi-master mode you have eventual consistency, which will break the requirements of the Camunda engine.

For high availability, you could use a master/slave pattern in the DB tier, but your Spring Boot applications would need to fail over at the DB tier.

AWS RDS makes DB HA very easy. RDS uses synchronous block level replication across availability zones and fails over using DNS…

regards

Rob

@Webcyberrob Thanks for your valuable response. So you mean that the DBs should be in master/slave mode, with DNS as the point of contact for the application to connect to the DB.
As it would be master/slave mode, the other site's DB will be kept in sync by replication.
The applications at both sites will connect via DNS, which in turn will point to the master only. Hence the isolation level requirement will not be violated. If the DB at site 1 fails, DNS will be updated to point to the slave and execution continues from there. Is my understanding correct?

Hi Anisk,

Yes, that's about right. That's effectively the way AWS does failover. If you do it yourself, you have to (1) detect that the master has failed and (2) effect a DNS update to redirect to the slave.
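
The two steps above can be sketched roughly as follows. This is a minimal illustration, not production failover code: `FailoverRouter`, the failure threshold and the health-check predicate are all hypothetical, and the "DNS update" is modelled simply as changing which URL `probe()` returns.

```java
import java.util.function.Predicate;

// Minimal sketch only: FailoverRouter and its parameters are hypothetical names.
class FailoverRouterDemo {

    static class FailoverRouter {
        private final String masterUrl;
        private final String slaveUrl;
        private final Predicate<String> healthCheck; // returns true if the URL is reachable
        private final int failureThreshold;          // consecutive failures before failing over
        private int consecutiveFailures = 0;
        private boolean failedOver = false;

        FailoverRouter(String masterUrl, String slaveUrl,
                       Predicate<String> healthCheck, int failureThreshold) {
            this.masterUrl = masterUrl;
            this.slaveUrl = slaveUrl;
            this.healthCheck = healthCheck;
            this.failureThreshold = failureThreshold;
        }

        // Intended to be called periodically; returns the URL clients should use.
        String probe() {
            if (failedOver) {
                return slaveUrl; // no automatic fail back: an operator must reset
            }
            if (healthCheck.test(masterUrl)) {
                consecutiveFailures = 0;            // step 1: master still healthy
            } else if (++consecutiveFailures >= failureThreshold) {
                failedOver = true;                  // step 2: redirect to the slave
            }
            return failedOver ? slaveUrl : masterUrl;
        }
    }

    public static void main(String[] args) {
        // The health check here always fails, standing in for an unreachable master.
        FailoverRouter router = new FailoverRouter(
                "db-site-a.example", "db-site-b.example", url -> false, 2);
        System.out.println(router.probe()); // first failure, still below the threshold
        System.out.println(router.probe()); // threshold reached, now redirected
    }
}
```

Requiring several consecutive failures and never failing back automatically are deliberate choices in this sketch: both reduce the chance that a brief network glitch leaves the two sites each believing they are the master.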

This can actually be quite tricky, and you need to be careful not to end up running as a ‘split brain’, particularly if you have automatic failover and fail back…

BTW, a long time ago, the Oracle RAC JDBC driver supported a cluster setup with failover and retry baked into the JDBC driver…

Ultimately, HA clustering is challenging to get working well in all scenarios, so the more you can buy (e.g. AWS RDS or similar), the easier your life will be :wink:

regards

Rob