Hi,
zclient.newFailCommand(job)
.retries(job.getRetries() - 1)
.retryBackoff(Duration.ofSeconds(Integer.parseInt(job.getCustomHeaders().get("backoffTime"))))
.errorMessage("Could not retrieve money due to: " + e.getMessage())
.send()
.exceptionally(t -> { throw new RuntimeException("Could not fail job: " + t.getMessage(), t); });
This retry approach works fine in SaaS, but when I tried the same code with Self-Managed, it does not work.
Please help me with this.
To be specific, the retry mechanism is not working in Self-Managed: the worker is invoked only once, then the service task is marked as completed and the token moves on to the next task.
In SaaS, the worker is retried as many times as configured, and only after that does the token move on.
This is the expected behavior when the job can be completed on the first run: in that case there is no need to fail and retry it afterwards.
Also, there is no difference between the SaaS and Self-Managed environments: the software is delivered as container images, and SaaS uses the same images we provide for Self-Managed.
I assume you are wondering why newFailCommand sometimes works (creates an incident or triggers a retry) and sometimes the job completes instead (the token moves to the next step). From the code snippet above, I would say it is because the command is issued asynchronously. Try using send().join() to make the call synchronous and see if that resolves it.
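To illustrate the difference, here is a minimal sketch that uses only the JDK's CompletableFuture as a stand-in for the future returned by send(). The class name FailCommandSketch and the helper sendFailCommand are hypothetical and are not part of the Zeebe client; they only simulate a command that the broker may reject. Without join(), the handler returns before the outcome of the command is known, and an error only surfaces inside the future. With join(), the handler blocks until the command is acknowledged or rejected.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

class FailCommandSketch {

    // Hypothetical stand-in for zclient.newFailCommand(job)...send().
    // It runs on another thread and fails if the simulated broker rejects it.
    static CompletableFuture<Void> sendFailCommand(boolean brokerRejects) {
        return CompletableFuture.runAsync(() -> {
            if (brokerRejects) {
                throw new IllegalStateException("command rejected");
            }
        });
    }

    // Asynchronous style, as in the original snippet: the method returns
    // immediately, and the exception thrown in exceptionally() stays inside
    // the returned future, which nobody inspects.
    static String fireAndForget(boolean brokerRejects) {
        sendFailCommand(brokerRejects)
            .exceptionally(t -> {
                throw new RuntimeException("Could not fail job: " + t.getMessage(), t);
            });
        return "handler returned";
    }

    // Synchronous style: join() blocks until the command has finished and
    // rethrows a failure as a CompletionException in the calling thread.
    static String sendAndJoin(boolean brokerRejects) {
        try {
            sendFailCommand(brokerRejects).join();
            return "fail command acknowledged";
        } catch (CompletionException e) {
            return "fail command rejected: " + e.getCause().getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(fireAndForget(true));
        System.out.println(sendAndJoin(true));
        System.out.println(sendAndJoin(false));
    }
}
```

In your worker, the equivalent change would be to end the fail command chain with .send().join() instead of .send().exceptionally(...), so that a rejected command shows up as an exception in the handler itself.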
Cheers!