Oauth token refresh can trigger rate limiting and log errors

I have observed that my Zeebe client app periodically emits error logs like this: Failed while fetching credentials: exception=java.io.IOException: Failed while requesting access token with status code 429 and message Too Many Requests.

I don’t see any other adverse effects besides the error logs, and the application seems to be able to continue to handle jobs afterwards, but these error logs are forcing our team to silence error alerting that we have set up for our client app.

I have already raised an issue on GitHub with more detailed information: Oauth token refresh can trigger rate limiting and log errors · Issue #13832 · camunda/zeebe · GitHub

Hi @Andy_Katz - just want to make sure I understand this issue correctly. You are seeing these errors in a Java application that is using the zeebe-client to communicate with a SaaS instance? What version of the zeebe-client is your application using?

We are using zeebe-client-java version 8.2.11

Thanks @Andy_Katz. I chatted with our engineers and what would really help is finding some way to reliably reproduce the issue - or at least, somewhat reliably, even if it takes a few tries. Do you have any thoughts on ways to try to reproduce it?

I think the questions I answered on the GitHub issue here probably give good insight into how to reproduce this.

In our test environment we have around 150 individual worker handlers across multiple instances of our client app. If you stand up that many workers all using the same client token, they will eventually need to refresh their tokens, and the number of concurrent refresh requests seems to trigger the rate limit.
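
To illustrate the mechanism described above, here is a minimal sketch of one way concurrent refreshes inside a single client process could be collapsed into one request. It is not how zeebe-client-java is implemented: `SharedTokenCache`, `Token`, the `fetcher` supplier, and `refreshMargin` are all hypothetical names made up for this example, and the supplier is simply assumed to wrap whatever call requests the access token.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not part of zeebe-client-java.
final class SharedTokenCache {

    // Minimal token holder: the token string and the time it expires.
    static final class Token {
        final String value;
        final Instant expiresAt;

        Token(String value, Instant expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Supplier<Token> fetcher;  // assumed to call the OAuth token endpoint
    private final Duration refreshMargin;   // refresh this long before the token expires
    private Token current;

    SharedTokenCache(Supplier<Token> fetcher, Duration refreshMargin) {
        this.fetcher = fetcher;
        this.refreshMargin = refreshMargin;
    }

    // synchronized: when many workers ask for a token at the same moment,
    // one caller performs the fetch while the others wait and then reuse it.
    synchronized String get() {
        boolean needsRefresh = current == null
                || !Instant.now().isBefore(current.expiresAt.minus(refreshMargin));
        if (needsRefresh) {
            current = fetcher.get();
        }
        return current.value;
    }
}
```

With this shape, 150 handlers calling `get()` in one process would share a single token request per refresh cycle. It would not help across separate instances of the app, since each process would still refresh on its own; spreading those out would need something extra, such as a randomized refresh margin per instance.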

That being said, I have not seen any rate limit issues in the last week (our log visibility expires after a week), so that might not be enough workers. But in theory, the more workers you have, the more likely it is that they will trigger a rate limit.

I read that GitHub thread but I misunderstood part of it. That makes a lot more sense now. I’ll reconnect with the engineering team this week and see what they think about next steps!

By the way, @nathan.loding I have just been granted access to the JIRA Support project by our Enterprise support partner. Would it be helpful for me to open a support ticket with these details?

@Andy_Katz - apologies for a late reply! Yes, I think opening an enterprise support ticket is a great option, because that will help prioritize the issue and bring it closer to the top of the queue!