Hello,
in our company we consider to use Camunda. We did some performance tests with Camunda using SQL server 2019 and PostgreSQL 11.
We are using REST API and .NET client. We use mainly external tasks and fetch and lock approach.
We observed much better performance with PostgreSQL. The better performance is probably caused by using much effective SQL commands with PosgreSQL than with SQL server.
When we started like 100 processes per second the performance of SQL server was very bad very quickly. Performance with PostgreSQL was stable for several days.
Main DB engine used in our company is SQL server and it would be very hard to get permission to use PostgreSQL.
Is it possible for you guys focus on improving SQL server performance?
Currently queries for SQL server are using row_number and rank functions to get subset of data e.g. external tasks. It is slow and not scaling well.
LIMIT is used by Postgre which provides much better performance. SQL server 2012 supports OFFSET and FETCH which do the same like LIMIT in PostgreSQL.
This you can see in DbSqlSessionFactory when you look for method databaseSpecificInnerLimitAfterStatements.
Hi @PetrF
Welcome to the forum.
We’ve also notices better performance when testing against Postgres than SQL server. Are you able to isolate more specifically what you suppose is going on on Camunda’s side that is cause performance issues.
Are you interested in creating a pull request to fix the issues that you find?
Hi @Niall
thanks for your reply. I can provide an example of such query optimization.
I would love to do a pull request but I’m a .NET developer with very little experience with java.
declare @LockExpTime DATETIME = '2021-04-12 19:30'
declare @start int = 0
declare @end int = 10
SELECT SUB.* FROM (
select RES.* , row_number() over (ORDER BY RES.PRIORITY_ desc) rnk FROM (
select distinct RES.* from (
select RES.*, PI.BUSINESS_KEY_, PD.VERSION_TAG_
from ACT_RU_EXT_TASK RES
left join ACT_RU_EXECUTION PI on RES.PROC_INST_ID_ = PI.ID_
inner join ACT_RE_PROCDEF PD on RES.PROC_DEF_ID_ = PD.ID_
WHERE (RES.LOCK_EXP_TIME_ is null or RES.LOCK_EXP_TIME_ <= @LockExpTime)
and (RES.SUSPENSION_STATE_ is null or RES.SUSPENSION_STATE_ = 1)
and (RES.RETRIES_ is null or RES.RETRIES_ > 2)
and ( RES.TOPIC_NAME_ like '%' )
) RES
)RES
) SUB
WHERE SUB.rnk >= @start AND SUB.rnk < @end ORDER BY SUB.rnk
Such query could be optimized in SQL server 2012 and above to something like this:
declare @LockExpTime DATETIME = '2021-04-12 19:30'
declare @start int = 0
declare @end int = 10
select distinct RES.* from (
select RES.*, PI.BUSINESS_KEY_, PD.VERSION_TAG_
from ACT_RU_EXT_TASK RES
left join ACT_RU_EXECUTION PI on RES.PROC_INST_ID_ = PI.ID_
inner join ACT_RE_PROCDEF PD on RES.PROC_DEF_ID_ = PD.ID_
WHERE (RES.LOCK_EXP_TIME_ is null or RES.LOCK_EXP_TIME_ <= @LockExpTime)
and (RES.SUSPENSION_STATE_ is null or RES.SUSPENSION_STATE_ = 1)
and (RES.RETRIES_ is null or RES.RETRIES_ > 2)
and (RES.TOPIC_NAME_ like '%')
) RES
ORDER BY RES.PRIORITY_ desc
OFFSET @start ROWS
FETCH NEXT @end - @start ROWS ONLY
Thanks for considering of such improvement in vNext .
All the best.
Petr