Camunda Spring Client: Failed with code 503: 'Service Unavailable'

Hi @guentherwieserwrwks,

A 503 Service Unavailable on /v2/jobs/activation typically means the Orchestration Cluster REST API could not activate jobs, usually because the gateway cannot reach the brokers or because the brokers are applying temporary backpressure.

Based on your description and similar cases, here are the most likely causes and solutions:

1. Check the 503 Response Details

First, please examine the full 503 response body. If it contains RESOURCE_EXHAUSTED, this indicates backpressure and you should implement retries with exponential backoff as per the Activate jobs API documentation.
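As a rough illustration of retrying with exponential backoff, here is a minimal Java sketch. It is not part of the Camunda client: the class `BackoffRetry`, its method names, and the idea of detecting backpressure by looking for RESOURCE_EXHAUSTED in the exception message are all assumptions made for this example. It is only relevant if you call the activation endpoint from your own code and want to wrap those calls.

```java
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper, not a Camunda API: retries a call with capped exponential backoff.
final class BackoffRetry {

    private BackoffRetry() {}

    // Upper bound for the wait before retry number `attempt` (1-based):
    // base * 2^(attempt - 1), capped at max.
    static Duration delayFor(int attempt, Duration base, Duration max) {
        long factor = 1L << Math.min(attempt - 1, 20); // bound the shift to keep the product small
        long millis = Math.min(base.toMillis() * factor, max.toMillis());
        return Duration.ofMillis(millis);
    }

    // Runs `call`, retrying while `retryable` accepts the failure and attempts remain.
    static <T> T callWithBackoff(Supplier<T> call,
                                 Predicate<RuntimeException> retryable,
                                 int maxAttempts,
                                 Duration base,
                                 Duration max) {
        for (int attempt = 1; ; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts || !retryable.test(e)) {
                    throw e; // out of attempts, or not a backpressure-style failure
                }
                long capMillis = delayFor(attempt, base, max).toMillis();
                // Full jitter: sleep a random time in [0, cap] so workers do not retry in lockstep.
                long sleepMillis = ThreadLocalRandom.current().nextLong(capMillis + 1);
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e; // stop retrying if the thread was interrupted
                }
            }
        }
    }
}
```

A call site might look like `BackoffRetry.callWithBackoff(() -> activate(), e -> String.valueOf(e.getMessage()).contains("RESOURCE_EXHAUSTED"), 5, Duration.ofMillis(200), Duration.ofSeconds(10))`, where `activate()` stands in for your own request code.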

2. Gateway and Broker Connectivity Issues

Since the error only appears after activation initially succeeded, check the following:

  • Gateway and broker health and logs, looking for partition or connection errors
  • That your Spring Boot client points to the correct REST and gRPC addresses
  • That the REST address includes the right context path, if your deployment uses one
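To rule out an address or context path mistake, you can build the full endpoint URL by hand and probe it from the machine that runs your Spring Boot app. The Java sketch below is only an illustration: the class `RestAddressCheck` and the example values `http://localhost:8080` and `/camunda` are placeholders I made up, so substitute your own. It targets the REST API's `/v2/topology` endpoint. If authentication is enabled, a 401 still tells you that the address and path are reachable.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper for checking that REST address + context path resolve to a live endpoint.
final class RestAddressCheck {

    private RestAddressCheck() {}

    // Joins the REST address, an optional context path, and an API path with exactly one
    // slash between parts, so "http://host/" + "/camunda/" + "/v2/topology" stays well-formed.
    static URI endpoint(String restAddress, String contextPath, String apiPath) {
        String base = restAddress.replaceAll("/+$", "");
        String ctx = contextPath == null
                ? ""
                : contextPath.replaceAll("^/+", "").replaceAll("/+$", "");
        String api = apiPath.replaceAll("^/+", "");
        return URI.create(base + (ctx.isEmpty() ? "" : "/" + ctx) + "/" + api);
    }

    // Sends a GET and returns the HTTP status code; the response body is discarded.
    static int probe(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
        return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder values: replace with your own REST address and context path.
        URI uri = endpoint("http://localhost:8080", "/camunda", "/v2/topology");
        System.out.println(uri + " -> HTTP " + probe(uri));
    }
}
```

A 404 here, while the gateway is otherwise healthy, would point toward a context path mismatch rather than a broker problem.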

3. Proxy/Ingress Configuration

If you have nginx, ingress controllers, or other proxies between your Spring Boot app and the Camunda gateway:

  • Intermediaries may be cutting long-lived job activation connections once their idle timeout expires
  • Solutions:
    • Configure proxy/ingress timeouts that are longer than the job activation request timeout
    • Align the worker stream timeout with the proxy's gRPC timeouts
    • Or disable job streaming: set streamEnabled=false in your job worker configuration

4. Job Worker Configuration Tuning

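Try adjusting these worker defaults in your Spring Boot application configuration. The property paths below follow the 8.8 Camunda Spring Boot Starter; in 8.6/8.7 the same worker defaults sit under camunda.client.zeebe.defaults, so please verify them against the documentation for your SDK version:

camunda:
  client:
    worker:
      defaults:
        request-timeout: 30s   # Increase if activation requests time out
        stream-enabled: false  # Disable job streaming if a proxy cuts long-lived connections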

Next Steps

To help diagnose this further, could you please share:

  1. The exact 503 response body (especially if it mentions RESOURCE_EXHAUSTED)
  2. Your current camunda.client configuration in Spring Boot
  3. Whether you have any proxies/ingress between your app and Camunda
  4. Any relevant logs from the gateway/broker pods

This will help pinpoint whether it’s a backpressure, connectivity, or proxy timeout issue.
