Hi everyone!
I’m excited to share a new community extension I’ve been working on: Camunda Health Monitor - a lightweight monitoring dashboard specifically designed for Camunda 7 clusters.
The Problem It Solves
In my years of working with Camunda 7, monitoring and observability have always been a pain point. I’ve tried many solutions: Datadog, Grafana + Prometheus, and even custom log parsing with Promtail + Loki feeding Grafana dashboards.
Each approach solved some specific problems but came with significant overhead:
- Multiple tools to configure and maintain in parallel
- Generic infrastructure metrics that don’t speak “Camunda”
- Complex integrations just to answer simple operational questions
- Heavy resource footprint for what should be straightforward monitoring
The solutions always felt too heavy for what we actually needed.
What I really wanted to know at 3 AM when getting paged:
- How many stuck process instances do we have?
- Which nodes are falling behind in job acquisition?
- Are there pending message/signal correlations that might indicate issues?
- What’s the actual JVM health across our cluster in the context of Camunda workload?
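To make those questions concrete, here is a rough sketch of how a few of them could map onto the Camunda 7 REST API's count endpoints. This is not the monitor's actual code. The base URL (including the `/engine-rest` prefix), the `CHECKS` names, and treating "no retries left" or "has an incident" as a proxy for "stuck" are my assumptions for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Camunda 7 REST count endpoints; each returns a JSON body like {"count": 3}.
# The keys on the left are hypothetical names chosen for this sketch.
CHECKS = {
    "open_incidents": ("/incident/count", {}),
    "failed_jobs": ("/job/count", {"noRetriesLeft": "true"}),
    "instances_with_incident": ("/process-instance/count", {"withIncident": "true"}),
}


def http_get_json(url, timeout=5):
    """Fetch a URL and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def node_summary(base_url, fetch=http_get_json):
    """Return {check_name: count} for one node; None marks a failed check."""
    summary = {}
    for name, (path, params) in CHECKS.items():
        url = base_url.rstrip("/") + path
        if params:
            url += "?" + urlencode(params)
        try:
            summary[name] = fetch(url)["count"]
        except (OSError, KeyError, TypeError, ValueError):
            # Node unreachable or unexpected payload: report "unknown", not 0.
            summary[name] = None
    return summary
```

Calling `node_summary("http://localhost:8080/engine-rest")` once per node gives a per-node view. Returning `None` instead of `0` on failure keeps an unreachable node from looking healthy.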
A broader solution already exists for the enterprise platform, but I also wanted something lightweight and open source. So I created this version for the community: a focused monitoring tool that does one thing well, monitoring Camunda clusters without the enterprise overhead.
What It Does
The monitor provides real-time visibility into:
- Multi-node cluster health - Monitor all your Camunda nodes from one dashboard
- Process execution metrics - Active instances, user tasks, external tasks
- Job executor insights - Throughput, failed jobs, acquisition rates per node
- Incident tracking - Real-time incident monitoring with drill-down
- JVM metrics - Heap, GC, CPU, threads (via JMX Exporter or Micrometer)
- Database analytics - Storage usage, slow queries, archivable instances
- Prometheus export - Integration with your existing monitoring stack
Built with Flask, Alpine.js, and Tailwind CSS - it’s truly lightweight and can run alongside your existing infrastructure.
Note on architecture: This open-source version is intentionally simplified for ease of adoption and contribution. For production-grade commercial deployments, you’d typically want Flask blueprints for better code organization, a proper Tailwind build pipeline, etc. But for a monitoring tool that “just works with no overhead” this streamlined approach hits the sweet spot.
Quick Start
git clone https://github.com/bibacrm/camunda-health-monitor.git
cd camunda-health-monitor
cp .env.example .env # Configure your Camunda nodes
pip install -r requirements.txt
python app.py
Docker deployment is also supported for production use.
Architecture Philosophy
This project intentionally uses a simplified architecture to lower the barrier for contributions:
- Single-file Flask app (no blueprints) - Easy to understand the entire codebase
- CDN-loaded Tailwind (no build pipeline) - No Node.js required
- Inline Alpine.js - No JavaScript bundling
“Is this production-ready?”
Yes, for small-to-medium deployments (1-10 nodes). The architecture prioritizes:
- Quick deployment (minutes, not hours)
- Easy customization (add your metrics without fighting a framework)
- Low resource footprint (<100MB memory)
For large-scale or highly customized deployments, you might want to add:
- Flask blueprints for better code organization
- Tailwind build pipeline for optimized CSS
- Redis caching for high-traffic scenarios
- Comprehensive test coverage
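On the caching point: before reaching for Redis, a small in-process TTL cache can already stop every dashboard refresh from hitting Camunda. The sketch below is a stand-in for that idea, not Redis itself and not code from the project. `TTLCache` and its parameters are hypothetical names.

```python
import threading
import time


class TTLCache:
    """Tiny in-process cache: values expire ttl_seconds after being computed."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # key -> (stored_at, value)
        self._lock = threading.Lock()

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute() if missing or stale."""
        with self._lock:
            now = self._clock()
            entry = self._entries.get(key)
            if entry is not None and now - entry[0] < self._ttl:
                return entry[1]
            # Computing under the lock serializes callers; fine for a sketch.
            value = compute()
            self._entries[key] = (now, value)
            return value
```

A route handler could then wrap its expensive call, for example `cache.get_or_compute("cluster", collect_cluster_metrics)`, where `collect_cluster_metrics` is whatever function gathers the data. Unlike Redis, this cache is per process, so each worker keeps its own copy.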
The simplified approach is intentional - it makes the project more accessible to contributors while remaining perfectly viable for production monitoring.
Questions for the Community
I’d love to hear your thoughts on:
- What metrics matter most to you when monitoring Camunda in production?
- What’s missing? Are there specific operational insights you need?
- Integration needs - What tools do you typically integrate with? (Grafana, PagerDuty, etc.)
- Camunda 8 support - Is there interest in adapting this for Camunda 8 clusters?
Repository & Documentation
The README includes:
- Complete setup instructions
- Docker deployment guide
- API documentation
- Prometheus metrics export
License Note
The project is licensed under the Apache License 2.0.
Contributing
This is a community project, and contributions of all kinds are welcome:
- Bug reports and feature requests
- Pull requests for new metrics or improvements
- Documentation improvements
- Testing with different Camunda configurations
Looking forward to your feedback and hoping this helps the community! If you find it useful, a GitHub star would be much appreciated.
Built for the Camunda community