We have deployed several Camunda services in our environment as Docker containers, including:
Zeebe
Optimize
Operate
Tasklist
Web Modeler
Connector
Identity
We are now planning to scale the system by running multiple replicas of each service. As part of this, we want to implement tracing to help identify performance bottlenecks across the setup.
We’re looking for a solution similar to what OpenTelemetry (OTel) offers for Java services—providing visibility into database queries, external service calls, and other dependencies.
We’re open to suggestions on how to design and implement this for a production-grade setup that scales efficiently.
Hi @DeploySage, welcome to the forums! Scaling Camunda takes more configuration than just running multiple replicas of each container. I would recommend reviewing the guides in our docs, which also talk a bit about observability and monitoring.
Thanks @nathan.loding, we are consistently working on it.
I have gone through the documents provided, but I’m still confused about how to monitor the Camunda services I mentioned earlier. Are there any environment variables I need to set for the Camunda services to expose metrics and traces?
@DeploySage - unfortunately that isn’t enough information. What are you trying to monitor, exactly? Resource usage is handled by the platform you deploy it to, and the metrics documentation page I linked to previously has information on enabling OpenTelemetry in the stack. Did you try those settings and they aren’t working?
Yes, I have tried the mentioned config for metrics, and for Zeebe the metrics are being exposed. Is it the same for the other Camunda services as well, such as:
Optimize
Operate
Tasklist
Web Modeler
Connector
Identity
Zeebe is exposing the metrics on :9600/metrics. I need some clarity on whether the other services expose metrics too, and how I can enable the same for them.
Thank you
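To see which containers actually expose metrics, one option is to probe each candidate endpoint and list the metric names it returns. Below is a minimal Python sketch, not an official Camunda tool. It assumes the endpoint serves the Prometheus text format. Only the Zeebe port and path (:9600/metrics) come from this thread; the host `localhost` and the `ENDPOINTS` mapping are assumptions, and entries for the other services should be added only after confirming their ports and paths in the documentation.

```python
import re
import urllib.request

# Matches the metric name at the start of a sample line in the
# Prometheus text exposition format.
_METRIC_NAME = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")


def metric_names(exposition_text):
    """Return the sorted, de-duplicated metric names in a Prometheus text payload."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _METRIC_NAME.match(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


def probe(url, timeout=5.0):
    """Fetch one endpoint; return (metric_names, None) or (None, error_message)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError) as exc:  # covers URLError, HTTPError, bad URLs
        return None, str(exc)
    return metric_names(body), None


# Assumption: Zeebe is reachable on localhost. The port and path are the ones
# reported in this thread. Add the other services here once their endpoints
# are confirmed; none are guessed in this sketch.
ENDPOINTS = {
    "zeebe": "http://localhost:9600/metrics",
}

if __name__ == "__main__":
    for service, url in ENDPOINTS.items():
        names, error = probe(url)
        if error is not None:
            print(f"{service}: not reachable at {url} ({error})")
        else:
            print(f"{service}: {len(names)} metric names exposed at {url}")
```

A service that reports "not reachable" either has metrics disabled or exposes them on a different port or path, which narrows down what to look for in that service's configuration.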