Seeking feedback on proposed breaking change in Zeebe Node client 1.0.0

With 1.0.0 scheduled to release in Q2 2021, I’m updating the Zeebe Node client to deal with the breaking changes.

At the moment, I have written a deprecation layer that means that pre-1.0 code written using the Zeebe Node client can just update the package dependency to 1.0.0 and run existing applications against the 1.0.0 broker, with no code changes.

I have a proposal for a breaking change in the 1.0.0 Node client, though, and would like to get feedback from users.

It’s documented in https://github.com/camunda-community-hub/zeebe-client-node-js/issues/210.

In essence:

This is an exhaustiveness check on the worker callback handler to ensure that all code paths take responsibility for the job that was activated.

At the moment, your IDE cannot warn you if you have code paths in the worker that result in the worker doing nothing until the broker times out the job activation. This breaking change forces the worker’s job handler to return the result of a call to the broker to complete, fail, or error the job, or signal that responsibility for this has been forwarded to an external system.

The IDE can then warn if there are code paths in the worker that don’t do one of these.

This will break existing code, where the worker job handler returns void.

So this existing code:

zbc.createWorker<any>({
    taskType: 'process-favorite-albums',
    taskHandler: (job, complete, worker) => {
        const { name, albums = [] } = job.variables
        worker.log(`${name} has the following albums: ${albums?.join?.(', ')}`)
       complete.succes()
    },
    fetchVariable: ['name', 'albums'],
})

Would need to be refactored to this:

zbc.createWorker<any>({
    taskType: 'process-favorite-albums',
    taskHandler: (job, complete, worker) => {
        const { name, albums = [] } = job.variables
        worker.log(`${name} has the following albums: ${albums?.join?.(', ')}`)
        // Must now return the call to complete for exhaustiveness check
        return complete.succes()
    },
    fetchVariable: ['name', 'albums'],
})

It is a code safety ergonomics change, and with the 1.0.0 release it’s the right time to introduce a breaking change. So far, I’ve avoided all the other breaking changes in 1.0.0 by writing a translation layer and deprecating the previous gRPC API surface. So, this change would be categorical - instead of a drop-in replacement, moving to 1.0 would require some refactoring.

Is it worth it? Is zero-refactoring transition to 1.0 worth leaving this safety feature out?

Please feel free to leave any thoughts here or in the GitHub issue.

  • Josh

What about just debug warn… Worker do not return any result in required time? Timeout or bad code design.

1 Like

Thanks! I changed the naming for the feature based on your suggestion.

API v0 - this worker is badly designed and can time out:

Screen Shot 2021-03-05 at 7.08.05 PM

API v1 (proposed) - that code is no longer valid (also using new APIv1 completion method):
Screen Shot 2021-03-05 at 7.11.43 PM

Error message gives some hint to the cause:

I can see how to do it via the type system as a static check.

To do it as a warning at design time, the only thing I can think of at the moment is to write an ESLint rule, and users would need to install both eslint and the rule. That’s not a very reliable, or simple, way to add it…

I still do not understand what happens if worker calls some api… for long time… more than timeout. All path are correct and no return bacause worker takes too much time to return data.

1 Like

Then it will timeout. There is no way to reason about the time taken in the type system.

That will depend on the runtime behaviour. This check, on the other hand, detects code paths that will always time out.

It is analogous to an exhaustiveness check for a switch / case statement.