[Community Extension] Lightweight Health Monitor for Camunda 7 Clusters - Seeking Feedback

Champa · November 10, 2025, 3:09pm

Hi everyone!

I’m excited to share a new community extension I’ve been working on: Camunda Health Monitor - a lightweight monitoring dashboard specifically designed for Camunda 7 clusters.

The Problem It Solves

After years of working with Camunda 7, I’ve found monitoring and observability to be a persistent pain point. I’ve tried many solutions: Datadog, Grafana + Prometheus, and even custom log parsing with Promtail + Loki feeding Grafana dashboards.

Each approach solved some specific problems but came with significant overhead:

Multiple tools to configure and maintain in parallel
Generic infrastructure metrics that don’t speak “Camunda”
Complex integrations just to answer simple operational questions
Heavy resource footprint for what should be straightforward monitoring

The solutions always felt too heavy for what we actually needed.

What I really wanted to know at 3 AM when getting paged:

How many stuck process instances do we have?
Which nodes are falling behind in job acquisition?
Are there pending message/signal correlations that might indicate issues?
What’s the actual JVM health across our cluster in the context of Camunda workload?
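
Some of these questions map onto count endpoints that the Camunda 7 REST API already exposes (`/incident/count`, `/job/count`, `/process-instance/count`). Here is a minimal Python sketch of that idea, not necessarily how the monitor does it internally. It assumes the REST API is reachable under `/engine-rest` on each node; the helper names and the example node URL are made up for illustration, and open incidents and failed jobs are only proxies for "stuck" instances.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Camunda 7 REST count endpoints; each one responds with {"count": <n>}.
CHECKS = {
    "open_incidents": ("/incident/count", {}),
    "failed_jobs": ("/job/count", {"withException": "true"}),
    "active_instances": ("/process-instance/count", {"active": "true"}),
}


def build_url(base_url, path, params):
    """Join the engine base URL, the endpoint path, and an optional query string."""
    url = base_url.rstrip("/") + path
    return url + ("?" + urlencode(params) if params else "")


def parse_count(body):
    """Extract the integer from a {"count": n} response body."""
    return int(json.loads(body)["count"])


def fetch_counts(base_url, timeout=5):
    """Run every check against one node (needs network access to that node)."""
    results = {}
    for name, (path, params) in CHECKS.items():
        with urlopen(build_url(base_url, path, params), timeout=timeout) as resp:
            results[name] = parse_count(resp.read())
    return results

# Example call (hypothetical node address):
#   fetch_counts("http://localhost:8080/engine-rest")
```

Calling `fetch_counts` once per node gives a per-node picture. Authentication and error handling are left out here to keep the sketch short.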

A broader solution has been implemented for the enterprise platform, but I also wanted something lightweight and open source for the community: a focused monitoring tool that does one thing well, which is monitoring Camunda clusters without the enterprise overhead.

What It Does

The monitor provides real-time visibility into:

Multi-node cluster health - Monitor all your Camunda nodes from one dashboard
Process execution metrics - Active instances, user tasks, external tasks
Job executor insights - Throughput, failed jobs, acquisition rates per node
Incident tracking - Real-time incident monitoring with drill-down
JVM metrics - Heap, GC, CPU, threads (via JMX Exporter or Micrometer)
Database analytics - Storage usage, slow queries, archivable instances
Prometheus export - Integration with your existing monitoring stack
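
To give a feel for what the Prometheus export involves, here is a minimal Python sketch that renders per-node values in the Prometheus text exposition format. The `camunda_` prefix, the `node` label, and the metric names are assumptions made for illustration; the names the project actually exports are listed in the README.

```python
def escape_label(value):
    """Escape backslash, double quote, and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_prometheus(per_node):
    """Render {node: {metric: value}} as Prometheus text exposition lines.

    Samples are grouped by metric so each metric gets a single TYPE line.
    """
    names = sorted({name for values in per_node.values() for name in values})
    lines = []
    for name in names:
        metric = f"camunda_{name}"  # hypothetical metric prefix
        lines.append(f"# TYPE {metric} gauge")
        for node in sorted(per_node):
            if name in per_node[node]:
                value = per_node[node][name]
                lines.append(f'{metric}{{node="{escape_label(node)}"}} {value}')
    return "\n".join(lines) + "\n"
```

Serving this string from a `/metrics` route with a `text/plain` content type is typically enough for a Prometheus scrape job to pick it up.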

Built with Flask, Alpine.js, and Tailwind CSS - it’s truly lightweight and can run alongside your existing infrastructure.

Note on architecture: This open-source version is intentionally simplified for ease of adoption and contribution. For production-grade commercial deployments, you’d typically want Flask blueprints for better code organization, a proper Tailwind build pipeline, etc. But for a monitoring tool that “just works with no overhead”, this streamlined approach hits the sweet spot.

Quick Start

git clone https://github.com/bibacrm/camunda-health-monitor.git
cd camunda-health-monitor
cp .env.example .env  # Configure your Camunda nodes
pip install -r requirements.txt
python app.py

Docker deployment is also supported for production use.

Architecture Philosophy

This project intentionally uses a simplified architecture to lower the barrier for contributions:

Single-file Flask app (no blueprints) - Easy to understand the entire codebase
CDN-loaded Tailwind (no build pipeline) - No Node.js required
Inline Alpine.js - No JavaScript bundling

“Is this production-ready?”
Yes, for small-to-medium deployments (1-10 nodes). The architecture prioritizes:

Quick deployment (minutes, not hours)
Easy customization (add your metrics without fighting a framework)
Low resource footprint (<100MB memory)

For large-scale or highly customized deployments, you might want to add:

Flask blueprints for better code organization
Tailwind build pipeline for optimized CSS
Redis caching for high-traffic scenarios
Comprehensive test coverage

The simplified approach is intentional - it makes the project more accessible to contributors while remaining perfectly viable for production monitoring.

Questions for the Community

I’d love to hear your thoughts on:

What metrics matter most to you when monitoring Camunda in production?
What’s missing? Are there specific operational insights you need?
Integration needs - What tools do you typically integrate with? (Grafana, PagerDuty, etc.)
Camunda 8 support - Is there interest in adapting this for Camunda 8 clusters?

Repository & Documentation

GitHub: https://github.com/bibacrm/camunda-health-monitor

The README includes:

Complete setup instructions
Docker deployment guide
API documentation
Prometheus metrics export

License Note

The project uses an Apache 2.0 license.

Contributing

This is a community project, and I welcome contributions of any kind:

Bug reports and feature requests
Pull requests for new metrics or improvements
Documentation improvements
Testing with different Camunda configurations

I’m looking forward to your feedback and hope this helps the community! If you find it useful, a GitHub star would be much appreciated.

Built for the Camunda community