DB Utilization Hitting 100% due to Camunda Engine Query

Hi,

Can any of the experts in this forum please help me understand the challenge below?

We are observing the following query in the slow query logs whenever DB utilization hits 100%.

Can you please advise which Camunda engine API internally issues this query?

select distinct RES.* from db_camunda.ACT_RU_EXECUTION RES
inner join db_camunda.ACT_RE_PROCDEF P on RES.PROC_DEF_ID_ = P.ID_
WHERE RES.PARENT_ID_ is null and P.KEY_ = 'ChildWIProcess'
and RES.SUSPENSION_STATE_ = 1
and EXISTS ( select ID_ from db_camunda.ACT_RU_VARIABLE
  WHERE NAME_ = 'WID' and RES.ID_ = EXECUTION_ID_
  and ( ( TYPE_ is not null and TYPE_ = 'string' and TEXT_ is not null
          and TEXT_ LIKE '%20200811173647390-4%' ESCAPE '\\' ) ) )
and EXISTS ( select ID_ from db_camunda.ACT_RU_VARIABLE
  WHERE NAME_ = 'ProcessStatus' and RES.ID_ = EXECUTION_ID_
  and NOT ( ( TYPE_ is not null and TYPE_ = 'string' and TEXT_ is not null
              and TEXT_ = 'COMPLETED' ) ) )
and EXISTS ( select ID_ from db_camunda.ACT_RU_VARIABLE
  WHERE NAME_ = 'ProcessStatus' and RES.ID_ = EXECUTION_ID_
  and NOT ( ( TYPE_ is not null and TYPE_ = 'string' and TEXT_ is not null
              and TEXT_ = 'REJECTED' ) ) )
order by RES.BUSINESS_KEY_ desc
LIMIT 2147483647 OFFSET 0
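For what it's worth, SQL of this shape (join on ACT_RE_PROCDEF by KEY_, EXISTS subqueries against ACT_RU_VARIABLE, ordering by BUSINESS_KEY_) is what the engine generates for a process instance query filtered by process variables. Below is only a hedged sketch of the kind of Java call that could produce it; the class and method names are hypothetical, the filter values are copied from the SQL above, and whether it is application code, a REST client, or the engine itself issuing it is exactly the open question.

import org.camunda.bpm.engine.RuntimeService;
import org.camunda.bpm.engine.runtime.ProcessInstance;
import java.util.List;

// Hypothetical lookup class: shows a ProcessInstanceQuery whose generated SQL
// has the same shape as the slow query above.
public class ChildWiLookup {

    private final RuntimeService runtimeService;

    public ChildWiLookup(RuntimeService runtimeService) {
        this.runtimeService = runtimeService;
    }

    public List<ProcessInstance> findOpenInstances(String widFragment) {
        return runtimeService.createProcessInstanceQuery()
                .processDefinitionKey("ChildWIProcess")                 // P.KEY_ = 'ChildWIProcess'
                .active()                                               // RES.SUSPENSION_STATE_ = 1
                .variableValueLike("WID", "%" + widFragment + "%")      // TEXT_ LIKE '%...%'
                .variableValueNotEquals("ProcessStatus", "COMPLETED")   // NOT (... = 'COMPLETED')
                .variableValueNotEquals("ProcessStatus", "REJECTED")    // NOT (... = 'REJECTED')
                .orderByBusinessKey().desc()                            // order by RES.BUSINESS_KEY_ desc
                .list();                                                // LIMIT 2147483647 OFFSET 0
    }
}

If a call like this exists in the application (or a client hits the equivalent REST endpoint with variable filters), that would explain the query showing up in the slow query log.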

@Gaga What's the DB capacity and the available storage in the DB?

The DB instance class is m4.4xlarge and the total storage is currently 181 GB.

Can you please provide a hint as to which API this query is being issued through?

This is the only query we can identify in the slow query logs that takes a long time to execute.

@Gaga Seems like you're using an RDS/Aurora MySQL instance?

This query tries to fetch records from the ACT_RU_EXECUTION table filtered by process variables: WID matching a LIKE pattern, plus ProcessStatus not equal to "COMPLETED" and not equal to "REJECTED".

Querying for process/task/execution/activity instances by process variables takes extra time because process variables are not indexed, so the query scans the entire variable table, and that results in slowness. Creating an index on the process variable columns is also expensive, and it would slow down write/update operations as well.
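As an illustration of the difference, if the WID value were also set as the process instance business key at start time (an assumption about your process design, not something the thread confirms), the lookup could avoid the variable scan entirely, because BUSINESS_KEY_ on ACT_RU_EXECUTION is indexed. A minimal sketch:

import org.camunda.bpm.engine.RuntimeService;
import org.camunda.bpm.engine.runtime.ProcessInstance;
import java.util.List;

// Sketch under the assumption that WID is (or could be) used as the business key.
public class BusinessKeyLookup {

    private final RuntimeService runtimeService;

    public BusinessKeyLookup(RuntimeService runtimeService) {
        this.runtimeService = runtimeService;
    }

    // Start the instance with WID as its business key (design assumption).
    public ProcessInstance start(String wid) {
        return runtimeService.startProcessInstanceByKey("ChildWIProcess", wid);
    }

    // Indexed lookup on ACT_RU_EXECUTION.BUSINESS_KEY_; no EXISTS subqueries against ACT_RU_VARIABLE.
    public List<ProcessInstance> findByWid(String wid) {
        return runtimeService.createProcessInstanceQuery()
                .processDefinitionKey("ChildWIProcess")
                .processInstanceBusinessKey(wid)
                .active()
                .list();
    }
}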

Do you have a conditional boundary event in your process?

Thanks.

We are using a MySQL DB.

No, we don't have a conditional boundary event in the process, but we do have one normal conditional event in the process flow.

Can you please suggest any workaround?

Also, can you please advise on the following:

We moved to an Aurora DB in a lower environment to test DB scaling on the fly, but we are getting the following error even though we tried enabling/disabling forced scaling as well:

Caused by: java.sql.SQLException: The last transaction was aborted due to Seamless Scaling. Please retry.
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:965)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3978)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3914)

I hope you've set the minimum and maximum capacity units for the DB cluster. Aurora Serverless can reduce capacity down to the minimum or increase it up to the maximum capacity unit.

Under the following conditions, however, Aurora Serverless might not be able to find a scaling point:

  • Long-running queries or transactions are in progress.
  • Temporary tables or table locks are in use.

If either of the above cases happens, Aurora Serverless continues to try to find a scaling point so that it can initiate the scaling operation (unless “Force Scaling” is enabled). It does this for as long as it determines that the DB cluster should be scaled.

Maybe you can give a try by selecting “Forced Scaling”.

Note: Amazon Aurora Serverless auto scaling is a vertical scaling solution, where a more powerful instance takes over from a less powerful one, making efficient use of the underlying Aurora shared storage.

That query gets executed when the execution arrives at the conditional event. Maybe we can take a look at your process if you share it.

Also, run this query and check the output:

SELECT @@hostname as hostname, @@max_connections AS max_connections, 
(SELECT VARIABLE_VALUE FROM global_status WHERE VARIABLE_NAME = 'THREADS_CONNECTED') AS threads_connected,
(SELECT VARIABLE_VALUE FROM global_status WHERE VARIABLE_NAME = 'UPTIME') AS uptime;

Thanks aravindhrs

On DB scaling:

We tried forced scaling, but what we observe is that the DB CPU remains at its peak and Aurora does not scale; it stays the same.

If transaction-failure errors occur, can you please guide us on how to identify the error in the Aurora logs?

Is there a way to automate the retry of these failed transactions? Ours is a custom Spring Boot application with an embedded Camunda engine.
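Not an official recommendation, just a sketch of one way to retry in plain Java when the driver reports the Seamless Scaling abort. The attempt count, backoff, and the exception-message check are assumptions; Spring Retry's @Retryable would be the more declarative alternative if you already pull in that dependency.

import java.sql.SQLException;
import java.util.function.Supplier;

public class ScalingRetry {

    private static final int MAX_ATTEMPTS = 3;
    private static final long BACKOFF_MILLIS = 1000L;

    // Retries the given action only when the root cause is the Aurora "Seamless Scaling" abort.
    public static <T> T withRetry(Supplier<T> action) {
        SQLException last = null;
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                SQLException sqlCause = findSqlCause(e);
                String message = (sqlCause == null) ? null : sqlCause.getMessage();
                if (message == null || !message.contains("Seamless Scaling")) {
                    throw e; // not the scaling abort, don't mask it
                }
                last = sqlCause;
                try {
                    Thread.sleep(BACKOFF_MILLIS * attempt); // simple linear backoff
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
        throw new IllegalStateException("Transaction still failing after retries", last);
    }

    // Walks the cause chain looking for the underlying SQLException.
    private static SQLException findSqlCause(Throwable t) {
        while (t != null) {
            if (t instanceof SQLException) {
                return (SQLException) t;
            }
            t = t.getCause();
        }
        return null;
    }
}

Usage would wrap the call that opens the Camunda transaction, e.g. ScalingRetry.withRetry(() -> runtimeService.startProcessInstanceByKey("ChildWIProcess", wid)); the retried block needs to be a complete, idempotent unit of work, since the aborted transaction is rolled back and replayed.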


I will try running this query and see the outcome.

Below is the query output. Can you please help me understand what we can learn from these results?

hostname          max_connections   threads_connected   uptime
ip-172-27-0-183   5351              65                  24087350

Thanks

@Gaga Haven't you set up a datasource and connection pooling for the DB connections?

I will check with my internal DB team. How much is it supposed to be?

If they have not set it, I guess it will take the default numbers.
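If it turns out nothing was configured explicitly, here is a minimal sketch of pinning the pool size in a Spring Boot configuration class. The JDBC URL, credentials, and pool numbers below are placeholders, not recommendations; the point is only that the pool size should be set deliberately and kept well below max_connections (5351 in the output above).

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import javax.sql.DataSource;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class CamundaDataSourceConfig {

    @Bean
    public DataSource dataSource() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mysql://<aurora-endpoint>:3306/db_camunda"); // placeholder endpoint
        config.setUsername("camunda");        // placeholder
        config.setPassword("secret");         // placeholder
        config.setMaximumPoolSize(20);        // deliberate cap, far below max_connections
        config.setMinimumIdle(5);
        config.setConnectionTimeout(30_000);  // ms to wait for a connection from the pool
        return new HikariDataSource(config);
    }
}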