[Community Extension] Lightweight Health Monitor for Camunda 7 Clusters - Seeking Feedback

Hi everyone! :waving_hand:

I’m excited to share a new community extension I’ve been working on: Camunda Health Monitor - a lightweight monitoring dashboard specifically designed for Camunda 7 clusters.

:bullseye: The Problem It Solves

In my years of working with Camunda 7, monitoring and observability have always been a pain point. I’ve tried many solutions: Datadog, Grafana with Prometheus, and even custom log parsing with Promtail and Loki feeding Grafana dashboards.

Each approach solved some specific problems but came with significant overhead:

  • Multiple tools to configure and maintain in parallel
  • Generic infrastructure metrics that don’t speak “Camunda”
  • Complex integrations just to answer simple operational questions
  • Heavy resource footprint for what should be straightforward monitoring

The solutions always felt too heavy for what we actually needed.

What I really wanted to know at 3 AM when getting paged:

  • How many stuck process instances do we have?
  • Which nodes are falling behind in job acquisition?
  • Are there pending message/signal correlations that might indicate issues?
  • What’s the actual JVM health across our cluster in the context of Camunda workload?
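Several of these questions map onto count endpoints in the Camunda 7 REST API. Here is a minimal sketch, assuming the REST API is reachable under `/engine-rest` on each node and treating "instances with incidents" as a rough proxy for "stuck". The helper names, the `CHECKS` keys, and the injectable `fetch` function are my own illustration, not code from the project:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Camunda 7 REST count endpoints and the filter used for each question.
# The dictionary keys are illustrative names, not part of any API.
CHECKS = {
    "instances_with_incidents": ("/process-instance/count", {"withIncident": "true"}),
    "open_incidents": ("/incident/count", {}),
    "failed_jobs": ("/job/count", {"noRetriesLeft": "true"}),
    "open_external_tasks": ("/external-task/count", {}),
}


def build_url(base_url, path, params):
    """Join base URL, endpoint path, and query parameters."""
    url = base_url.rstrip("/") + path
    return url + "?" + urlencode(params) if params else url


def http_fetch(url, timeout=5):
    """Default fetcher: GET the URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def collect_counts(base_url, fetch=http_fetch):
    """Return {check_name: count}, with None for any check that failed."""
    counts = {}
    for name, (path, params) in CHECKS.items():
        try:
            counts[name] = int(fetch(build_url(base_url, path, params))["count"])
        except Exception:
            # A node that is down should not break the whole overview.
            counts[name] = None
    return counts
```

Calling `collect_counts("http://localhost:8080/engine-rest")` against a running engine would return one number per check. Passing a custom `fetch` makes it easy to add authentication or to test without a live node.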

A broader solution exists for the enterprise platform, but I wanted something lightweight and open source as well. So I built this version for the community: a focused tool that does one thing well, monitoring Camunda clusters without the enterprise overhead.

:rocket: What It Does

The monitor provides real-time visibility into:

:white_check_mark: Multi-node cluster health - Monitor all your Camunda nodes from one dashboard
:white_check_mark: Process execution metrics - Active instances, user tasks, external tasks
:white_check_mark: Job executor insights - Throughput, failed jobs, acquisition rates per node
:white_check_mark: Incident tracking - Real-time incident monitoring with drill-down
:white_check_mark: JVM metrics - Heap, GC, CPU, threads (via JMX Exporter or Micrometer)
:white_check_mark: Database analytics - Storage usage, slow queries, archivable instances
:white_check_mark: Prometheus export - Integration with your existing monitoring stack
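To make the Prometheus export item concrete, here is a small sketch of rendering per-node values in the Prometheus text exposition format. It is not the project's exporter; the function names and the metric name `camunda_open_incidents` are hypothetical, and the help text is assumed to be a single line:

```python
def escape_label(value):
    """Escape backslash, double quote, and newline as the text format requires."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name, help_text, samples):
    """Render one gauge family; samples is a list of (labels_dict, value)."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape_label(val)}"' for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    # The exposition format expects a trailing newline after the last sample.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical metric name and node labels, for illustration only.
    print(render_gauge(
        "camunda_open_incidents",
        "Open incidents per node",
        [({"node": "node1"}, 2), ({"node": "node2"}, 0)],
    ))
```

Serving this string with content type `text/plain` from a `/metrics` route is enough for a Prometheus scrape job to pick it up.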

Built with Flask, Alpine.js, and Tailwind CSS, it is lightweight and can run alongside your existing infrastructure.

Note on architecture: This open-source version is intentionally simplified for ease of adoption and contribution. A production-grade commercial deployment would typically use Flask blueprints for better code organization, a proper Tailwind build pipeline, and similar refinements. For a monitoring tool that “just works with no overhead”, the streamlined approach hits the sweet spot.

:package: Quick Start

git clone https://github.com/bibacrm/camunda-health-monitor.git
cd camunda-health-monitor
cp .env.example .env  # Configure your Camunda nodes
pip install -r requirements.txt
python app.py

Docker deployment is also supported for production use.
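For readers wondering what "configure your Camunda nodes" might look like in code, here is a hedged sketch of reading a node list from an environment variable. The variable name `CAMUNDA_NODES` and the `name=url,name=url` format are assumptions made for this example; the real keys are the ones listed in the repository's `.env.example`:

```python
import os


def parse_nodes(raw):
    """Parse 'name=url,name=url' into {name: url}, skipping blank entries."""
    nodes = {}
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the first '=' only, so URLs containing '=' stay intact.
        name, sep, url = entry.partition("=")
        if not sep or not name.strip() or not url.strip():
            raise ValueError(f"Expected name=url, got {entry!r}")
        nodes[name.strip()] = url.strip().rstrip("/")
    return nodes


# HYPOTHETICAL variable name; check .env.example for the real ones.
NODES = parse_nodes(os.environ.get("CAMUNDA_NODES", ""))
```

Failing fast on a malformed entry tends to be friendlier than silently monitoring fewer nodes than intended.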

:wrench: Architecture Philosophy

This project intentionally uses a simplified architecture to lower the barrier for contributions:

  • Single-file Flask app (no blueprints) - Easy to understand the entire codebase
  • CDN-loaded Tailwind (no build pipeline) - No Node.js required
  • Inline Alpine.js - No JavaScript bundling

“Is this production-ready?”
Yes, for small-to-medium deployments (1-10 nodes). The architecture prioritizes:

  1. Quick deployment (minutes, not hours)
  2. Easy customization (add your metrics without fighting a framework)
  3. Low resource footprint (<100MB memory)

For large-scale or highly customized deployments, you might want to add:

  • Flask blueprints for better code organization
  • Tailwind build pipeline for optimized CSS
  • Redis caching for high-traffic scenarios
  • Comprehensive test coverage
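As an illustration of the caching idea, the sketch below uses a tiny in-process cache with a time-to-live rather than Redis. It is a stand-in to show the pattern, not the project's code and not a Redis client; the class and method names are my own:

```python
import time


class TTLCache:
    """Tiny in-process cache: recompute a value only after ttl seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so tests can control time
        self._store = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute() when stale."""
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=10)
    # Both calls share one computation while the entry is fresh.
    print(cache.get_or_compute("cluster", lambda: {"nodes": 3}))
    print(cache.get_or_compute("cluster", lambda: {"nodes": 99}))
```

With several dashboard users refreshing at once, a short TTL like this keeps the number of requests hitting the engine roughly constant. A shared store such as Redis only becomes necessary once the monitor itself runs as more than one process.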

The simplified approach is intentional - it makes the project more accessible to contributors while remaining perfectly viable for production monitoring.

:thinking: Questions for the Community

I’d love to hear your thoughts on:

  1. What metrics matter most to you when monitoring Camunda in production?
  2. What’s missing? Are there specific operational insights you need?
  3. Integration needs - What tools do you typically integrate with? (Grafana, PagerDuty, etc.)
  4. Camunda 8 support - Is there interest in adapting this for Camunda 8 clusters?

:link: Repository & Documentation

GitHub: https://github.com/bibacrm/camunda-health-monitor

The README includes:

  • Complete setup instructions
  • Docker deployment guide
  • API documentation
  • Prometheus metrics export

:page_facing_up: License Note

The project is licensed under the Apache License 2.0.

:folded_hands: Contributing

This is a community project, and contributions of any kind are welcome, including:

  • Bug reports and feature requests
  • Pull requests for new metrics or improvements
  • Documentation improvements
  • Testing with different Camunda configurations

Looking forward to your feedback and hoping this helps the community! If you find it useful, a GitHub star would be much appreciated :glowing_star:


Built with :heart: for the Camunda community