A question about workers' long polling and getting multiple jobs at once

Jabowski · November 28, 2023, 3:17pm

Hi!

I’m fairly new to Camunda and right now I am trying to wrap my head around a few key concepts. A question I have concerns workers and the maxJobsActive/maxJobsToActivate parameter that can be set during worker start. The question is this:

If I can set the maximum number of jobs a worker will activate when a poll succeeds, how does Zeebe decide when to close a poll?
The poll could
a) be closed as soon as a single job of the required type becomes available or
b) be kept open until further jobs of the type become available, possibly honoring the poll’s maxJobsToActivate parameter.

Is it as simple as keeping it open until either the request timeout is up or the number of jobs is reached? The documentation says “The request is completed when at least one job becomes available.” (so no) - but even in a busy environment, some time (however little) will have to pass for jobs to accumulate, right? So is there an arbitrary number of jobs that have to become available for Zeebe to collect before it closes the poll?
How is this managed, considering the implications for either throughput (when only a single job is handed out for each poll) or latency (if a job is kept waiting much longer than necessary while further jobs arrive and the poll stays open)?

Thank you for your responses,

Jan

nathan.loding · November 28, 2023, 3:34pm

Hi @Jabowski, welcome to the forums! I think one point of confusion might be the term “Zeebe”; Zeebe is the process engine itself, and job workers connect to the engine. Zeebe doesn’t have any specific concerns around maxJobsActive for a worker; that’s for the worker to manage. (Note: these job worker details are specific to the Java/Spring clients.)

When you define a job worker you assign it a type to listen to; when the job worker starts, it polls the engine for jobs of that type. If no jobs are found, it waits for the pollInterval before it polls for jobs again. It continues this loop until there are jobs available. It then fetches up to maxJobsActive jobs and begins processing them. As soon as 30% or fewer of those jobs are left to execute, it polls the engine again to see if there are more jobs. This loop continues until all jobs are completed; then it waits for the pollInterval and the whole process repeats.

For instance, imagine a pollInterval of 10 seconds, and maxJobsActive set to 3. There are no processes running in Zeebe, so when the job worker starts, there are no jobs pending. Then 10 different processes start within that 10-second poll interval. The worker will then fetch 3 jobs and begin handling them. When 1 job is left to be handled, it polls for up to 2 additional jobs. It repeats this several more times until all 10 jobs are done. Then it waits 10 seconds before polling for jobs again.
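
To make the refill behaviour concrete, here is a small, self-contained simulation of the loop described above. It is only a sketch and does not use the Zeebe client: the class and method names are made up for illustration, jobs are handled one at a time, and the refill threshold is assumed to be 30% of maxJobsActive (the parameter name used in the original question) rounded to the nearest whole job, which is what makes "1 job left out of 3" trigger a new poll.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of the worker loop described above; not Zeebe client code.
class WorkerPollSimulation {

    /** Returns the size of each batch the simulated worker fetches. */
    public static List<Integer> simulate(int pendingJobs, int maxJobsActive) {
        // Assumption: refill threshold is 30% of maxJobsActive, rounded to the nearest integer.
        int threshold = Math.round(maxJobsActive * 0.3f);
        List<Integer> batches = new ArrayList<>();
        int active = 0; // jobs fetched by the worker but not yet handled

        while (pendingJobs > 0 || active > 0) {
            // Poll only when the worker is at or below the threshold and jobs are waiting.
            if (active <= threshold && pendingJobs > 0) {
                int fetched = Math.min(maxJobsActive - active, pendingJobs);
                pendingJobs -= fetched;
                active += fetched;
                batches.add(fetched);
            }
            active--; // the worker finishes handling one job
        }
        return batches;
    }

    public static void main(String[] args) {
        // 10 pending jobs, maxJobsActive = 3, as in the example above
        System.out.println(simulate(10, 3)); // prints [3, 2, 2, 2, 1]
    }
}
```

The first fetch takes 3 jobs, and each later fetch tops the worker back up to 3 once only 1 job is left, until the 10 jobs are exhausted.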

Does that help?

Jabowski · November 28, 2023, 3:49pm

Hi @nathan.loding ,

Thank you for your reply! It clears up a couple of things, yes.

Correct me if I’m wrong here, but the way you described it, the process does not employ any long polling, right? My confusion is actually about scenarios where a worker has an open long poll, and whether it would be possible at all to fetch multiple jobs through that request (because this would mean waiting for more than one job). If that weren’t the case, things would be much clearer.

nathan.loding · November 28, 2023, 4:51pm

@Jabowski - the process doesn’t do any polling. The job worker is fully external to the process engine, and the job worker itself polls for jobs. How the worker polls is really an implementation detail within your job worker. The engine does support long polling - if the worker is using long polling, it closes the request when there is “at least one job” available. It is possible for there to be multiple jobs depending on how many process instances are running concurrently.
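
The "at least one job" rule can be pictured with a toy in-memory model. This is not how the engine or gateway is implemented; it is a sketch built on a plain `BlockingQueue`, and the class name, method name, and `requestTimeoutMillis` parameter are assumptions chosen for illustration. The request blocks until a first job shows up or the timeout expires, then returns whatever else is already waiting, capped at maxJobsToActivate. It never waits for the batch to fill up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical toy model of a long-polling activate request; not Zeebe code.
class LongPollSketch {

    public static List<String> activateJobs(
            BlockingQueue<String> jobs, int maxJobsToActivate, long requestTimeoutMillis) {
        List<String> activated = new ArrayList<>();
        try {
            // Block until at least one job is available or the request times out.
            String first = jobs.poll(requestTimeoutMillis, TimeUnit.MILLISECONDS);
            if (first == null) {
                return activated; // timed out with no jobs: empty response
            }
            activated.add(first);
            // Take any further jobs that are already available, without waiting for more.
            jobs.drainTo(activated, maxJobsToActivate - 1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        return activated;
    }

    public static void main(String[] args) {
        BlockingQueue<String> jobs = new LinkedBlockingQueue<>(List.of("job-1", "job-2"));
        // Two jobs are waiting, so the request returns both immediately even though up to 5 were asked for.
        System.out.println(activateJobs(jobs, 5, 1000)); // prints [job-1, job-2]
    }
}
```

In this model a response holds more than one job only when several jobs were already queued at the moment the request was served, which matches the point above about concurrently running process instances.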