Hi Ivgi,
I am checking the engine code in Camunda 7.13.
Looking at your stack trace and the Java code, I found that each deferred request is added to a blocking queue. This queue has a fixed capacity of 200 (class org.camunda.bpm.engine.rest.impl.FetchAndLockHandlerImpl:62):
protected BlockingQueue<FetchAndLockRequest> queue = new ArrayBlockingQueue<>(200);
Requests are added to the queue (deferred) when asyncResponseTimeout is not null and there are no tasks to return (class org.camunda.bpm.engine.rest.impl.FetchAndLockHandlerImpl:320-331):
if (result.wasSuccessful()) {
  List<LockedExternalTaskDto> lockedTasks = result.getTasks();
  if (!lockedTasks.isEmpty() || dto.getAsyncResponseTimeout() == null) { // response immediately if tasks available
    asyncResponse.resume(lockedTasks);
    LOG.log(Level.FINEST, "Resuming request with {0}", lockedTasks);
  } else {
    addRequest(incomingRequest);
    LOG.log(Level.FINEST, "Deferred request");
  }
}
If the queue is full, you get the exception that you saw (class org.camunda.bpm.engine.rest.impl.FetchAndLockHandlerImpl:229-236):
protected void addRequest(FetchAndLockRequest request) {
  if (!this.queue.offer(request)) {
    AsyncResponse asyncResponse = request.getAsyncResponse();
    this.errorTooManyRequests(asyncResponse);
  }
  this.condition.signal();
}
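To make the "queue is full" case concrete, here is a minimal standalone sketch of how a bounded ArrayBlockingQueue behaves when offer() is called past its capacity. It is not Camunda code: the class BoundedQueueSketch, the method countRejected and the load of 205 requests are made up for illustration. Only the capacity of 200 is taken from the snippet above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class, not part of Camunda.
class BoundedQueueSketch {

    // Offers `attempts` items to a queue with the given capacity and
    // returns how many were rejected because the queue was already full.
    static int countRejected(int capacity, int attempts) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        int rejected = 0;
        for (int i = 0; i < attempts; i++) {
            // offer() never blocks: it returns false when no space is left,
            // which is the branch where addRequest() reports the error.
            if (!queue.offer("request-" + i)) {
                rejected++;
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 200 is the capacity quoted above; 205 is an arbitrary example load.
        System.out.println("rejected: " + countRejected(200, 205)); // prints "rejected: 5"
    }
}
```

So once 200 deferred requests are waiting and none have been taken out, every further deferred request is rejected immediately rather than waiting for space.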
It seems that your requests are filling up that queue, so the server can no longer serve your workers. The queue should be emptied on the next run of the acquire() method, which empties the queue by copying its contents to another list and then iterates over that list to respond to each caller.
I would also investigate how your Camunda database is handling that load. It may help to enable some of the logging categories described at https://docs.camunda.org/manual/latest/user-guide/logging/ to understand further what is going on, such as:
org.camunda.bpm.engine.cmd
org.camunda.bpm.engine.externaltask
org.camunda.bpm.engine.impl.persistence.entity.ExternalTaskEntity
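Coming back to the queue being emptied by acquire(): the sketch below shows, under assumptions, why a drain step frees capacity for new deferred requests. It uses BlockingQueue.drainTo() as one way to move the contents into another list. Whether acquire() uses exactly this call is an assumption on my side, and DrainSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class, not part of Camunda.
class DrainSketch {

    // Fills a bounded queue to capacity, drains it into a separate list,
    // and returns how many items ended up in that list.
    static int drainedCount(int capacity) {
        BlockingQueue<Integer> queue = filledQueue(capacity);
        List<Integer> pending = new ArrayList<>();
        queue.drainTo(pending); // removes all elements and adds them to the list
        return pending.size();
    }

    // Same steps, but returns the free capacity of the queue after draining.
    static int remainingAfterDrain(int capacity) {
        BlockingQueue<Integer> queue = filledQueue(capacity);
        queue.drainTo(new ArrayList<>());
        return queue.remainingCapacity();
    }

    // Helper: a queue of the given capacity with every slot occupied.
    private static BlockingQueue<Integer> filledQueue(int capacity) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        for (int i = 0; i < capacity; i++) {
            queue.offer(i);
        }
        return queue;
    }

    public static void main(String[] args) {
        System.out.println("drained: " + drainedCount(200));
        System.out.println("free slots afterwards: " + remainingAfterDrain(200));
    }
}
```

If the drain step runs too rarely or too slowly compared to the rate of incoming requests, for example because the database is slow, the queue stays full and requests keep getting rejected.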
I couldn't find a way to override this queue capacity. Your best option is to check whether something is wrong with your Camunda instance, or to scale your Camunda nodes horizontally so they can serve more workers.
Regards.