Camunda Zeebe Client Error: io.grpc.StatusRuntimeException: UNAVAILABLE: unavailable

aravindhrs · September 28, 2023, 9:42am

We are using Camunda 8.2.5 and are getting the error below. It happens for all the workers. Any idea what causes this issue, and how can we fix it?

2023-09-28 05:43:19.233 WARN 1 --- [lt-executor-187] io.camunda.zeebe.client.job.poller : Failed to activate jobs for worker <<job_worker>> and job type <<job_type>>
io.grpc.StatusRuntimeException: UNAVAILABLE: unavailable
at io.grpc.Status.asRuntimeException(Status.java:539) ~[grpc-api-1.54.2.jar!/:1.54.2]
at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:487) ~[grpc-stub-1.54.2.jar!/:1.54.2]
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:576) ~[grpc-core-1.54.2.jar!/:1.54.2]
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70) ~[grpc-core-1.54.2.jar!/:1.54.2]
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:757) ~[grpc-core-1.54.2.jar!/:1.54.2]
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:736) ~[grpc-core-1.54.2.jar!/:1.54.2]
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) ~[grpc-core-1.54.2.jar!/:1.54.2]
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133) ~[grpc-core-1.54.2.jar!/:1.54.2]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[na:na]
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[na:na]
at java.base/java.lang.Thread.run(Thread.java:833) ~[na:na]

nathan.loding · September 28, 2023, 2:01pm

Hi @aravindhrs - the error indicates that the worker cannot connect to the gRPC gateway. That could be caused by quite a number of different issues. Are you sure the services are all running and healthy? Are there any firewalls (or other networking needs) between the worker and Zeebe that need to be reviewed? Is your ingress running (or, if not using an ingress controller, is all your port forwarding configured)?

aravindhrs · September 28, 2023, 6:26pm

Hi @nathan.loding, thanks for looking into this. Here is some more context.

We have ClientXA (Spring Zeebe client, 5 workers) and ClientXB (Spring Zeebe client, 10 workers), both up and running on version 8.2.5. The Zeebe cluster is fine, and ClientXA works and is able to process jobs. The issue is with ClientXB, which throws the error above. ClientXB was working fine until yesterday and has been throwing the error since today.

Also, the pods are up, running, and healthy, and the ingress is fine as well.

What could be the root cause for this issue?

nathan.loding · September 28, 2023, 7:18pm

It’s really hard to say. What I can say is that the io.grpc.StatusRuntimeException indicates a connectivity issue to the gRPC gateway. If you have access to the server where ClientXB is running, can you connect to Zeebe from the CLI? Or test the networking by pinging the gateway, etc.?
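
If no CLI is available on that server, the network test can be roughly sketched in plain Java. This is a minimal sketch, not part of the Zeebe client: the `GatewayProbe` class is hypothetical, and the default host `localhost` and port `26500` are assumptions that should be replaced with the gateway address the worker is actually configured with. It only checks that a TCP connection can be opened, so a successful result does not prove that gRPC or TLS is working.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of the Zeebe client: checks plain TCP reachability only.
class GatewayProbe {

    // Returns true if a TCP connection to host:port opens within timeoutMillis.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: all treated as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: replace with the gateway address your worker uses.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 26500;
        System.out.println(host + ":" + port + " reachable over TCP: "
                + canConnect(host, port, 3000));
    }
}
```

Running it from the same pod or host as ClientXB matters, because the point is to test the network path that the failing worker actually uses.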

aravindhrs · October 4, 2023, 6:31am

Yes, we have tested it and it is working.

aravindhrs · October 4, 2023, 6:32am

@nathan.loding Could the root cause be different Zeebe versions?

aravindhrs · October 4, 2023, 7:00am

Hi @nathan.loding, our Zeebe clusters run the following versions:

Environment A runs a higher Zeebe cluster version (8.2.5) than Environment B.

We are getting the error below for all the workers in Environment A, whereas the workers in Environment B run normally without any errors.

2023-09-28 05:43:19.233 WARN 1 --- [lt-executor-187] io.camunda.zeebe.client.job.poller : Failed to activate jobs for worker <<job_worker>> and job type <<job_type>>
io.grpc.StatusRuntimeException: UNAVAILABLE: unavailable

Does this issue occur because of the higher Zeebe cluster version used in Environment A? If so, could you please explain how the version affects the workers and causes this exception?

I checked the Camunda docs for a version matrix, but it is not very clear. A version matrix like the one above, or in any better format you have, would be sufficient.

Thanks.

nathan.loding · October 4, 2023, 3:53pm

I am not aware of any incompatibility that would cause that. Are you using the same version of the Zeebe client (and/or spring-zeebe) for both workers?

Are the Zeebe deployments identical other than the version (for instance, both are using TLS, etc.)?

And also the obvious question that wasn’t asked, but better safe to ask than assume: have you double checked the app properties and environment variables to ensure that the correct URL/IP is set for the cluster? Do you perhaps have a conflict between the application.properties and the environment variables?
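
To illustrate the kind of conflict meant here: in a Spring Boot application, an OS environment variable takes precedence over the same setting in application.properties, so a stale variable can silently point a worker at the wrong gateway. The following is a minimal sketch of such a check. The `ConfigConflictCheck` class is hypothetical, and the property name `zeebe.client.broker.gateway-address` and the environment variable name `ZEEBE_CLIENT_BROKER_GATEWAYADDRESS` are assumptions; substitute whatever names your application actually uses.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

// Hypothetical helper: reports when a properties file and the environment
// disagree on the value of one setting.
class ConfigConflictCheck {

    // Returns a description of the conflict, or empty if the two sources agree
    // or at least one of them does not define the value.
    static Optional<String> findConflict(Properties props, String propertyKey,
                                         Map<String, String> env, String envName) {
        String fromFile = props.getProperty(propertyKey);
        String fromEnv = env.get(envName);
        if (fromFile == null || fromEnv == null
                || fromFile.trim().equals(fromEnv.trim())) {
            return Optional.empty();
        }
        return Optional.of(propertyKey + "=" + fromFile.trim()
                + " but " + envName + "=" + fromEnv.trim());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Assumed property name and value, for illustration only.
        String key = "zeebe.client.broker.gateway-address";
        props.setProperty(key, "zeebe-gateway:26500");
        // Assumed environment variable name, for illustration only.
        String envName = "ZEEBE_CLIENT_BROKER_GATEWAYADDRESS";
        System.out.println(findConflict(props, key, System.getenv(), envName)
                .orElse("no conflict found"));
    }
}
```

In a real check, the `Properties` object would be loaded from the deployed application.properties rather than filled in by hand.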

aravindhrs · October 5, 2023, 4:25am

@nathan.loding, yes, the Zeebe client version is the same across all the workers. We downgraded the Zeebe cluster from 8.2.5 to 8.2.0, and since then there have been no more Zeebe client exceptions.

2023-09-28 05:43:19.233 WARN 1 --- [lt-executor-187] io.camunda.zeebe.client.job.poller : Failed to activate jobs for worker <<job_worker>> and job type <<job_type>>
io.grpc.StatusRuntimeException: UNAVAILABLE: unavailable

This exception is now resolved.

nathan.loding · October 5, 2023, 6:20pm

@aravindhrs I am glad the issue is resolved, but I’m not glad that it was downgrading Zeebe that fixed it! I will pass this along to our engineering team and see if they have any thoughts on what might be happening!

aravindhrs · October 6, 2023, 4:22am

@nathan.loding Thanks for your support. Please let us know if there are any updates. I have been monitoring the worker logs since the downgrade, and everything works fine.

system · October 13, 2023, 4:23am

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.