External Task Retry Java

Hi Dear Camunders,

I am trying to implement an external task retry strategy.
I followed the documentation:
https://docs.camunda.org/manual/7.10/user-guide/process-engine/external-tasks/#reporting-task-failure

externalTaskService.handleFailure(
  task.getId(),
  "externalWorkerId",
  "Address could not be validated: Address database not reachable",     // errorMessage
  "Super long error details",                                           // errorDetails
  1,                                                                    // retries
  10L * 60L * 1000L);                                                   // retryTimeout

where 1 sets the number of retries.

Documentation says:
A failure is reported for the locked task such that it can be retried once more after 10 minutes. The process engine does not decrement retries itself. Instead, such a behavior can be implemented by setting the retries to task.getRetries() - 1 when reporting a failure.

So I understand that if I put task.getRetries() - 1 instead of “1” as the number of retries, it should decrement the retries and eventually create an incident. The question is how to set the initial number of retries in that case.

I am able to do so via REST, but I can't find a way to do it with Java.
https://docs.camunda.org/manual/7.10/reference/rest/external-task/put-retries/

I am sorry for my Java skills they are not the best of the best.

M.

Hi @Michal_S,

here is an example for the handler in the external task client (https://github.com/camunda/camunda-external-task-client-java):

    subscriptionBuilder.handler((externalTask, externalTaskService) -> {
      // retries can be null on the first attempt, so initialize it here
      Integer retries = externalTask.getRetries();
      if (retries == null) {
        retries = 3;
      }
      String content = externalTask.getVariable("content");
      String message = "Sorry, your tweet has been rejected: " + content;
      System.out.println(message);
      if ("retry!".equals(content)) { // null-safe comparison
        // report the failure with one retry fewer; retry timeout is 10 seconds
        externalTaskService.handleFailure(externalTask, "one more", "change content!", retries - 1, 10000);
      } else {
        Map<String, Object> variables = new HashMap<>();
        variables.put("message", message);
        externalTaskService.complete(externalTask, variables);
      }
    });

You have to initialize the number of retries in the first call, then you can decrement it in every round. It’s easy to adapt the logic to any other client structure or language.
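The initialize-then-decrement rule can be isolated from the client so it is easy to check on its own. The following is a minimal sketch in plain Java, not part of the Camunda API: the class `RetryCounter`, the method `remainingAfterFailure`, and the initial value of 3 are hypothetical names and values chosen for illustration. It assumes, as in the handler above, that the retry count is null before the first failure has been reported.

```java
// Hypothetical helper, not part of any Camunda library.
final class RetryCounter {

    private RetryCounter() {}

    /**
     * Returns the retry count to report with the next failure.
     *
     * @param currentRetries the value read from the task; assumed to be null
     *                       before any failure has been reported
     * @param initialRetries the number of attempts the worker wants in total
     */
    static int remainingAfterFailure(Integer currentRetries, int initialRetries) {
        int current = (currentRetries == null) ? initialRetries : currentRetries;
        // Never report a negative value; 0 means "no more retries".
        return Math.max(current - 1, 0);
    }

    public static void main(String[] args) {
        Integer retries = null; // nothing reported yet
        for (int attempt = 1; ; attempt++) {
            int next = remainingAfterFailure(retries, 3);
            System.out.println("attempt " + attempt + " failed, reporting retries=" + next);
            if (next == 0) {
                break; // reporting 0 is the point at which an incident is created
            }
            retries = next;
        }
    }
}
```

With an initial value of 3, the loop reports 2, 1 and then 0, so the task is attempted three times in total before the count reaches 0. In a real handler the returned value would be passed as the retries argument of handleFailure.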

Hope this helps, Ingo

@Ingo_Richtsmeier
You are right, this is fairly easy, but I thought it should be set upfront in the engine… ehm, my mistake.
Thank you.

I am also trying to understand this behavior but am slightly confused. Below is my scenario; I would like to confirm the behavior.

  1. I started two external task client workers for the same topic with a lock duration of 30 seconds. I reckon this means that each worker's fetch-and-lock operation will be locked for 30 seconds. In other words, they will not fetch the task for the subscribed topic for 30 seconds:
     .lockDuration(30000) // 30 seconds
  2. Every time a retry is needed, I will set the values below in handleFailure, assuming

maxRetry = 2

.setRetryTimeout(2000) // 2 seconds
.setRetries(maxRetry - 1)

I also have asyncResponseTimeout() configured as 10000, i.e. 10 seconds.

With retries and retryTimeout, workers can specify a retry strategy. When setting retries to a value > 0, the task can be fetched again after retryTimeout expires. When setting retries to 0, a task can no longer be fetched and an incident is created for this task.

If the above is applied to my scenario, then after 2 seconds this task is available for fetching, but it can only be fetched by a worker once the lock duration has expired? Is my understanding correct?

How does asyncResponseTimeout affect it?

What values do I need to use if retries need to be done for 30 seconds?

Please help me to understand it.

Thanks

Hi @rawat,

  1. I started two external task client workers for the same topic with a lock duration of 30 seconds. I reckon this means that each worker's fetch-and-lock operation will be locked for 30 seconds. In other words, they will not fetch the task for the subscribed topic for 30 seconds

Correct.

If the above is applied to my scenario, then after 2 seconds this task is available for fetching, but it can only be fetched by a worker once the lock duration has expired? Is my understanding correct?

When you report a failure, the lock time is removed from the database and the 2 seconds start immediately after your call to .handleFailure().

The next fetchAndLock will lock the task again for 30 seconds.

How does asyncResponseTimeout affect it?

The asyncResponseTimeout() only reduces the latency for the polling interval. It has nothing to do with the lockDuration() and the retries.

So, your fetchAndLock request will be kept open for 10 seconds and if a new task is ready after 2 seconds from the retryTimeout, you will get the response after 2 seconds.
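The timing described above can be written down as simple arithmetic. This is only a rough model in plain Java, assuming the values from the scenario (retryTimeout 2000 ms, asyncResponseTimeout 10000 ms, lockDuration 30000 ms) and ignoring network latency and any engine-internal delays. The class `RetryTimeline` and its methods are hypothetical names for illustration, not Camunda API.

```java
// Hypothetical timing model, not part of any Camunda library.
final class RetryTimeline {

    private RetryTimeline() {}

    /** Earliest time (ms) at which a failed task can be fetched again. */
    static long availableAt(long failureReportedAt, long retryTimeout) {
        return failureReportedAt + retryTimeout;
    }

    /**
     * Time (ms) at which a long-polling fetch request returns the task,
     * or -1 if the request times out before the task becomes available.
     */
    static long fetchReturnsAt(long requestOpenedAt, long asyncResponseTimeout, long taskAvailableAt) {
        long deadline = requestOpenedAt + asyncResponseTimeout;
        if (taskAvailableAt > deadline) {
            return -1; // request ends without a task; the worker polls again
        }
        return Math.max(requestOpenedAt, taskAvailableAt);
    }

    /** Time (ms) until which the task stays locked after being fetched. */
    static long lockedUntil(long fetchedAt, long lockDuration) {
        return fetchedAt + lockDuration;
    }

    public static void main(String[] args) {
        long available = availableAt(0, 2_000);               // failure reported at t = 0
        long fetched = fetchReturnsAt(0, 10_000, available);  // request opened at t = 0
        long locked = lockedUntil(fetched, 30_000);
        System.out.println("available at " + available + " ms");
        System.out.println("fetched at " + fetched + " ms");
        System.out.println("locked until " + locked + " ms");
    }
}
```

Under these assumptions the task becomes available at 2000 ms, the open request returns it at 2000 ms, and the new lock runs until 32000 ms. The lock duration of the previous fetch plays no role once the failure has been reported.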

Hope this helps, Ingo

@Ingo_Richtsmeier Thanks a lot for the explanation.

That made things clear, including where to change the timeouts if required.