Tasklist Search API Performance Issue

Mohamed_Ahmed_Mohame · January 23, 2025, 11:24am

Hello All

Could you please advise, @Niall?

We are using Camunda 8.5 Self-Managed, and we are facing a performance issue with the Tasklist search API when we run a load test through Postman with 10 users, a 10-second ramp-up, and a 15-minute duration.

Please note that the error rate increases with the number of requests sent to Tasklist at the same time.
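
For anyone who wants to reproduce this load profile outside Postman, here is a minimal sketch in Python. It is not what Postman does internally. The `run_load` helper and the `send` callback are hypothetical names, and the sketch assumes the test duration is counted from the start of the ramp-up.

```python
import threading
import time
from typing import Callable


def run_load(send: Callable[[], bool], users: int,
             ramp_up_s: float, duration_s: float) -> dict:
    """Start `users` workers spread evenly over `ramp_up_s`; each worker
    calls `send` in a loop until `duration_s` has elapsed.

    `send` is a hypothetical callback that performs one request and
    returns True on success. An exception counts as a failure.
    """
    deadline = time.monotonic() + duration_s  # assumption: includes ramp-up
    lock = threading.Lock()
    stats = {"requests": 0, "errors": 0}

    def worker(start_delay: float) -> None:
        time.sleep(start_delay)  # stagger the start of each virtual user
        while time.monotonic() < deadline:
            try:
                ok = send()
            except Exception:
                ok = False
            with lock:
                stats["requests"] += 1
                if not ok:
                    stats["errors"] += 1

    threads = [
        threading.Thread(target=worker, args=(i * ramp_up_s / users,))
        for i in range(users)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    total = stats["requests"]
    stats["error_rate"] = stats["errors"] / total if total else 0.0
    return stats
```

The profile from this post would be `run_load(send, users=10, ramp_up_s=10, duration_s=15 * 60)`, where `send` wraps a single call to the Tasklist search endpoint.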

nathan.loding · January 23, 2025, 8:33pm

Hi @Mohamed_Ahmed_Mohame - please do not tag people unless they’ve already responded to your thread. This isn’t an official/priority support channel, it’s a community forum; if you need more immediate assistance we recommend opening a support ticket.

There are a number of possible causes, but the first I might look at is resource utilization for your Tasklist pod/deployment, as well as resource utilization for Elastic (because Tasklist is querying Elastic to fetch the data). Can you share anything about your deployment (how you’ve deployed it, where, what resources, how you’ve configured it, etc.)? Also curious if you see issues with other Tasklist endpoints, or endpoints of other apps (for instance, Operate)?

Mohamed_Ahmed_Mohame · January 29, 2025, 10:47am

Hi @nathan.loding

We are deploying Camunda on GKE, and these are the specs of the Camunda components:

Operate : 2 cpu / 2G memory
Tasklist : 1 cpu / 2G memory
Zeebe : 1 cpu / 2G memory

Please let me know if you need further information
Thank you

nathan.loding · January 29, 2025, 1:23pm

Hi @Mohamed_Ahmed_Mohame - what does the resource utilization for the Tasklist and Elasticsearch instances look like? When you start to see errors, is utilization peaking? Do you have issues with other APIs (for instance, Operate), or only Tasklist?

Mohamed_Ahmed_Mohame · February 9, 2025, 8:21am

Hi @nathan.loding

When we ran the performance test with 10 concurrent users, a 20-second ramp-up, and a 15-minute duration, we faced the following issue:
All Elasticsearch shards went down with the following exception

Here is the request sent to the Tasklist search endpoint:

{
    "candidateGroups": [
        "Role1",
        "Role2",
        "Role3",
        "Role4",
        "Role5",
        "Role6",
        "Role7",
        "Role8",
        "Role9",
        "Role10",
        "Role11",
        "Role12",
        "Role13"
    ],
    "includeVariables": [
        {
            "name": "variable1"
        },
        {
            "name": "variable2"
        },
        {
            "name": "variable3"
        },
        {
            "name": "variable4"
        },
        {
            "name": "variable5"
        },
        {
            "name": "variable6"
        },
        {
            "name": "variable7"
        },
        {
            "name": "variable8"
        }
    ],
    "pageSize": 10,
    "sort": [
        {
            "field": "creationTime",
            "order": "DESC"
        }
    ],
    "state": "CREATED",
    "taskVariables": [
        {
            "name": "statusCode",
            "value": "\"statusCode1\"",
            "operator": "eq"
        },
        {
            "name": "productCode",
            "value": "\"productCode1\"",
            "operator": "eq"
        }
    ]
}

The response returned is shown in the following image

Mohamed_Ahmed_Mohame · February 9, 2025, 8:40am

Please note that when we add the variables to the body of the search request, we face the following issue, but when we exclude them the request is faster and does not return the same error
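
One way to confirm this observation is to time the same search with and without the variable fields. The sketch below assumes "the variables" means both `includeVariables` and `taskVariables`, and that the request goes to the Tasklist REST endpoint `POST /v1/tasks/search`. The base URL, the bearer token, and the function names are placeholders, not values from this thread.

```python
import json
import time
import urllib.error
import urllib.request

# Assumption: placeholder address; replace with your own Tasklist URL.
TASKLIST_URL = "http://localhost:8082"
SEARCH_PATH = "/v1/tasks/search"


def build_search_body(with_variables: bool) -> dict:
    """Rebuild the search body from the thread, optionally without the
    includeVariables / taskVariables fields."""
    body = {
        "candidateGroups": [f"Role{i}" for i in range(1, 14)],
        "pageSize": 10,
        "sort": [{"field": "creationTime", "order": "DESC"}],
        "state": "CREATED",
    }
    if with_variables:
        body["includeVariables"] = [{"name": f"variable{i}"} for i in range(1, 9)]
        body["taskVariables"] = [
            {"name": "statusCode", "value": "\"statusCode1\"", "operator": "eq"},
            {"name": "productCode", "value": "\"productCode1\"", "operator": "eq"},
        ]
    return body


def timed_search(body: dict, token: str,
                 base_url: str = TASKLIST_URL) -> tuple[int, float]:
    """POST one search and return (HTTP status, elapsed seconds)."""
    req = urllib.request.Request(
        base_url + SEARCH_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumption: bearer auth
        },
        method="POST",
    )
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            resp.read()
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code  # 4xx/5xx responses are raised as HTTPError
    return status, time.perf_counter() - start
```

Calling `timed_search(build_search_body(True), token)` and `timed_search(build_search_body(False), token)` several times each should show whether the variable fields account for the slowdown.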

Mohamed_Ahmed_Mohame · February 9, 2025, 1:25pm

We start to see the errors about 5 minutes after the 10 users start using Tasklist.
As for the second question, we don’t use the Operate API, and we aren’t facing any issues with the Operate client.

The same bug appears in the Tasklist client when I search using variables, but when I remove the variables from the task search, the request completes successfully

nathan.loding · February 12, 2025, 6:56pm

@Mohamed_Ahmed_Mohame - the error that the shards are down seems like the first path to investigate deeply. This is why I asked about resource usage in Elastic. Are you seeing Elastic max its resources? Are you seeing errors in Elastic?
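
As a starting point for that investigation, the sketch below polls Elasticsearch for cluster health, unassigned shards, and JVM heap usage while the load test runs. It uses the standard `_cluster/health`, `_cat/shards`, and `_nodes/stats/jvm` endpoints. The base URL and helper names are placeholders, and it assumes the cluster is reachable without authentication.

```python
import json
import urllib.request

# Assumption: placeholder address; replace with your own Elasticsearch URL.
ES_URL = "http://localhost:9200"


def get_json(path: str, base_url: str = ES_URL):
    """GET an Elasticsearch endpoint and decode the JSON response."""
    with urllib.request.urlopen(base_url + path, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_health(health: dict) -> str:
    """One-line summary of a _cluster/health response."""
    return (
        f"status={health['status']} "
        f"active_shards={health['active_shards']} "
        f"unassigned_shards={health['unassigned_shards']}"
    )


def unassigned(shards: list) -> list:
    """Filter a _cat/shards?format=json response to unassigned shards."""
    return [s for s in shards if s.get("state") == "UNASSIGNED"]


def heap_used_percent(node_stats: dict) -> dict:
    """Map node name to JVM heap usage from a _nodes/stats/jvm response."""
    return {
        node["name"]: node["jvm"]["mem"]["heap_used_percent"]
        for node in node_stats["nodes"].values()
    }


# Example calls (these need a reachable cluster, so they are left commented out):
# print(summarize_health(get_json("/_cluster/health")))
# print(unassigned(get_json(
#     "/_cat/shards?format=json&h=index,shard,prirep,state,unassigned.reason")))
# print(heap_used_percent(get_json("/_nodes/stats/jvm")))
```

If heap usage climbs toward 100% shortly before the shards fail, that would point toward Elasticsearch being under-resourced for the variable queries, though other causes are possible.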

system · May 13, 2025, 6:56pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.