Performance Issue with fetchAndLock Query and Indexing

Hello All,

When using the fetchAndLock API to retrieve 100,000 external tasks for a specific topic, I observed that the call takes more than 30 seconds to respond. Upon investigation, I identified the primary bottleneck as the database query that fetches the external tasks from the act_ru_ext_task table.

The query involves a condition on the lock_exp_time_ column.

Condition: (RES.LOCK_EXP_TIME_ is null or RES.LOCK_EXP_TIME_ <= #{, jdbcType=TIMESTAMP})

Unfortunately, the lock_exp_time_ column does not have an index, so the database must scan the entire table to locate the relevant records, which significantly delays query execution.
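For reference, adding such an index in Postgres is a one-line DDL statement; the index name below is hypothetical, not part of the Camunda schema:

```sql
-- Hypothetical index name; table and column are from the Camunda schema.
CREATE INDEX act_idx_ext_task_lock_exp_time
    ON act_ru_ext_task (lock_exp_time_);
```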

To address this issue, I created an index on the lock_exp_time_ column. However, I discovered that the index is not used because of the NULL check in the condition: lock_exp_time_ remains NULL until the task is fetched for execution.
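Whether the planner actually uses the index can be checked with EXPLAIN ANALYZE on the fetch condition; this is a sketch with the timestamp parameter substituted by now():

```sql
EXPLAIN ANALYZE
SELECT *
  FROM act_ru_ext_task RES
 WHERE RES.LOCK_EXP_TIME_ IS NULL
    OR RES.LOCK_EXP_TIME_ <= now();
-- With many NULL rows, the plan typically reports a Seq Scan on
-- act_ru_ext_task rather than an Index Scan.
```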

To overcome this challenge, I would like to propose a solution:

We should initialize lock_exp_time_ with the current time when creating an external task in act_ru_ext_task. This ensures that the column never contains NULL values, enabling the index to be used during query execution.
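On the database side, the proposed initialization could be approximated with a backfill of existing rows plus a column default; this is a sketch only — the actual change would belong in the engine code that inserts into act_ru_ext_task:

```sql
-- Backfill existing rows so the column contains no NULLs (sketch;
-- in practice the engine would set the value at task creation).
UPDATE act_ru_ext_task
   SET lock_exp_time_ = now()
 WHERE lock_exp_time_ IS NULL;

-- Ensure newly created rows start with a non-NULL value.
ALTER TABLE act_ru_ext_task
  ALTER COLUMN lock_exp_time_ SET DEFAULT now();
```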

I have already conducted tests to compare the query execution time before and after implementing this solution. Here are the results:

Camunda Version: 7.13.0
Database Version: Postgres 14.7

  1. Query execution time without an index on lock_exp_time_, with NULL values present in the act_ru_ext_task table:

    Planning Time: 4.010 ms
    Execution Time: 30850.487 ms

  2. Query execution time with an index on lock_exp_time_, initializing it with the current time when creating an external task:

    Planning Time: 2.310 ms
    Execution Time: 0.139 ms
