SQL server - performance improvements

PetrF · April 6, 2021, 3:13pm

Hello,
in our company we consider to use Camunda. We did some performance tests with Camunda using SQL server 2019 and PostgreSQL 11.
We are using REST API and .NET client. We use mainly external tasks and fetch and lock approach.

We observed much better performance with PostgreSQL. The better performance is probably caused by using much effective SQL commands with PosgreSQL than with SQL server.
When we started like 100 processes per second the performance of SQL server was very bad very quickly. Performance with PostgreSQL was stable for several days.
Main DB engine used in our company is SQL server and it would be very hard to get permission to use PostgreSQL.

Is it possible for you guys focus on improving SQL server performance?

Currently queries for SQL server are using row_number and rank functions to get subset of data e.g. external tasks. It is slow and not scaling well.
LIMIT is used by Postgre which provides much better performance.
SQL server 2012 supports OFFSET and FETCH which do the same like LIMIT in PostgreSQL.
This you can see in DbSqlSessionFactory when you look for method databaseSpecificInnerLimitAfterStatements.

Thanks ahead for your answer.
All the best.
Petr

Niall · April 8, 2021, 12:56pm

Hi @PetrF
Welcome to the forum.
We’ve also notices better performance when testing against Postgres than SQL server. Are you able to isolate more specifically what you suppose is going on on Camunda’s side that is cause performance issues.

Are you interested in creating a pull request to fix the issues that you find?

PetrF · April 13, 2021, 12:22pm

Hi @Niall
thanks for your reply. I can provide an example of such query optimization.
I would love to do a pull request but I’m a .NET developer with very little experience with java.

Example:
Camunda API has fetch and lock method.
This method produces query similar to a following one, see sql mapping.

declare @LockExpTime DATETIME = '2021-04-12 19:30'
declare @start int = 0
declare @end int = 10

SELECT SUB.* FROM ( 
	select RES.* , row_number() over (ORDER BY RES.PRIORITY_ desc) rnk FROM ( 
		select distinct RES.* from ( 
			select RES.*, PI.BUSINESS_KEY_, PD.VERSION_TAG_ 
			from ACT_RU_EXT_TASK RES 
			left join ACT_RU_EXECUTION PI on RES.PROC_INST_ID_ = PI.ID_ 
		    inner join ACT_RE_PROCDEF PD on RES.PROC_DEF_ID_ = PD.ID_ 
			WHERE (RES.LOCK_EXP_TIME_ is null or RES.LOCK_EXP_TIME_ <= @LockExpTime) 
			  and (RES.SUSPENSION_STATE_ is null or RES.SUSPENSION_STATE_ = 1) 
			  and (RES.RETRIES_ is null or RES.RETRIES_ > 2) 
			  and ( RES.TOPIC_NAME_ like '%' ) 
		) RES 
	)RES 
) SUB 
WHERE SUB.rnk >= @start AND SUB.rnk < @end ORDER BY SUB.rnk

Such query could be optimized in SQL server 2012 and above to something like this:

declare @LockExpTime DATETIME = '2021-04-12 19:30'
declare @start int = 0
declare @end int = 10

	
select distinct RES.* from ( 
	select RES.*, PI.BUSINESS_KEY_, PD.VERSION_TAG_ 
	from ACT_RU_EXT_TASK RES 
	left join ACT_RU_EXECUTION PI on RES.PROC_INST_ID_ = PI.ID_ 
	inner join ACT_RE_PROCDEF PD on RES.PROC_DEF_ID_ = PD.ID_ 
	WHERE (RES.LOCK_EXP_TIME_ is null or RES.LOCK_EXP_TIME_ <= @LockExpTime) 
		and (RES.SUSPENSION_STATE_ is null or RES.SUSPENSION_STATE_ = 1) 
		and (RES.RETRIES_ is null or RES.RETRIES_ > 2) 
		and (RES.TOPIC_NAME_ like '%') 
) RES 
ORDER BY RES.PRIORITY_ desc
OFFSET @start ROWS 
FETCH NEXT @end - @start ROWS ONLY

Thanks for considering of such improvement in vNext .
All the best.
Petr

Yana · June 11, 2021, 8:37am

Created ticket: CAM-13529

system · January 30, 2024, 12:59pm