BUG: long polling request that is canceled and then started again causes Camunda tasks to get stuck

Working with long polling on external tasks

Seeing behaviour where, if I kill the worker without cleanly closing the connection and then bring the worker up again, the first new task generated by Camunda is assigned to my worker, but no response is returned to the worker. When Camunda generates a second task, the second task is returned as expected. After the lock timeout expires on the first task, it gets picked up by the worker again with no issues. If I set the asyncResponseTimeout to a shorter value, the first task will get picked up as soon as the worker starts up (I believe).
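For reference, the worker side is just the documented fetchAndLock / complete cycle against the REST API. A minimal sketch in Python (the workerId, topic name, port, and timing values here are illustrative, not my exact setup):

```python
import requests

BASE = "http://localhost:8088/engine-rest"  # adjust host/port to your setup

def poll_forever():
    """Standard external-task worker loop using long polling."""
    while True:
        # The server holds this request open for up to asyncResponseTimeout ms
        # if no tasks are available (long polling).
        resp = requests.post(
            f"{BASE}/external-task/fetchAndLock",
            json={
                "workerId": "worker",
                "maxTasks": 10,
                "asyncResponseTimeout": 60000,
                "topics": [{"topicName": "mytopic", "lockDuration": 300000}],
            },
            timeout=70,  # client read timeout slightly above the server's long-poll window
        )
        resp.raise_for_status()
        for task in resp.json():
            # ... do the actual work here, then report completion
            requests.post(
                f"{BASE}/external-task/{task['id']}/complete",
                json={"workerId": "worker"},
            )

if __name__ == "__main__":
    poll_forever()
```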

Is there some behaviour going on here with connections resuming on the Camunda side?

Thanks

Okay, after doing some more testing, this looks like a bug, or at the very least unexpected and undocumented behaviour.

Steps to reproduce:

Using Postman you can reproduce this behaviour.

Consider a BPMN such as external-task.bpmn (3.1 KB)

Using the Camunda Tomcat-7.10.0 Docker image (default configuration, nothing changed)

In postman:

  1. Deploy the BPMN to the Camunda server.

  2. POST localhost:8088/engine-rest/external-task/fetchAndLock with the following body:

{
  "workerId": "worker",
  "maxTasks": 10,
  "usePriority": false,
  "asyncResponseTimeout": 60000,
  "topics": [
    {
      "topicName": "mytopic",
      "lockDuration": 300000
    }
  ]
}

Long polling starts.

  3. In Camunda Tasklist, start an instance of the External Task process from above. It should be picked up by Postman, which returns a result.

  4. Send the Postman request again, but after a moment press “Cancel” to abruptly terminate the request.

  5. Send the request again, and then quickly go into Tasklist and start a new process instance for the External Task. The result will never be returned by Postman. BUT if you go into Cockpit, you will see the task has been locked by the worker named “worker”. You can start multiple instances: they will keep getting locked, but the results are never returned.

If you restart the Camunda server, the issue resolves itself.
If you wait out the full asyncResponseTimeout of the long poll, the issue ‘appears’ to resolve itself, but all tasks created during that window remain locked…
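For anyone wanting to reproduce this without Postman, here is a rough script of the same steps (a sketch assuming the server from above on localhost:8088; the abrupt cancel is simulated with a short client-side read timeout, which tears down the connection mid-poll much like Postman’s Cancel button):

```python
import requests

BASE = "http://localhost:8088/engine-rest"
BODY = {
    "workerId": "worker",
    "maxTasks": 10,
    "usePriority": False,
    "asyncResponseTimeout": 60000,
    "topics": [{"topicName": "mytopic", "lockDuration": 300000}],
}

def fetch(timeout):
    return requests.post(f"{BASE}/external-task/fetchAndLock", json=BODY, timeout=timeout)

# Step 4: start long polling, then abort the connection after a moment.
# requests raises a Timeout and discards the socket, i.e. the same abrupt
# termination as pressing "Cancel" in Postman.
try:
    fetch(timeout=2)
except requests.exceptions.Timeout:
    pass

# Step 5: poll again; now start a process instance in Tasklist.
# Expected: this call returns the new task promptly. Observed: it blocks
# for the full asyncResponseTimeout even though Cockpit shows the task
# locked by "worker".
tasks = fetch(timeout=70).json()
print(tasks)
```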

Okay found the bug…
https://app.camunda.com/jira/browse/CAM-9562

This is a pretty significant bug IMO… it would be great to have better visibility on this sort of thing…
It is a really large annoyance during development, where abruptly terminated connections happen continually…

Hi Stephen,

this bug is fixed in the next patch-level release, which is scheduled for the end of the month.

Cheers,
Tassilo

Glad to hear the next patch release will be out so soon :+1:t2::+1:t2: