Camunda 8.8 - Issue with self-managed installation of basic components in an AWS EKS cluster

Hi,

I am trying to set up the recent Camunda 8.8 basic components (Orchestration: Zeebe, Operate, Tasklist) with embedded Elasticsearch and Connectors, without any security, ingress, or domain, in an AWS EKS cluster. It is meant to be a plain local-style setup, similar to a kind cluster setup.

Deploying the attached values.yaml brings up the pods and services for the camunda-connectors, camunda-elasticsearch, camunda-zeebe, and camunda-zeebe-gateway components.

After port-forwarding camunda-zeebe-gateway to port 8080, I opened another terminal and ran curl -u demo:demo http://localhost:8080/v2/topology, which returned the expected result:
{"brokers":[{"nodeId":0,"host":"camunda-zeebe-0.camunda-zeebe","port":26501,"partitions":[{"partitionId":1,"role":"leader","health":"healthy"}],"version":"8.8.9"}],"clusterSize":1,"partitionsCount":1,"replicationFactor":1,"gatewayVersion":"8.8.9","lastCompletedChangeId":"-1"}

But when I try to open Operate or Tasklist at http://localhost:8080/operate or http://localhost:8080/tasklist, the web page does not load.

The Zeebe pod logs show the following errors repeatedly:

[grpc-executor-0] WARN
io.camunda.search.es.clients.ElasticsearchSearchClient - Failed to execute search query
co.elastic.clients.elasticsearch._types.ElasticsearchException: [es/search] failed: [index_not_found_exception] no such index [camunda-user-8.8.0_alias]

[grpc-executor-0] WARN
io.camunda.search.es.clients.ElasticsearchSearchClient - Failed to execute search query
org.elasticsearch.client.ResponseException: method [POST], host [http://camunda-elasticsearch:9200], URI [/camunda-user-8.8.0_alias/search?typed_keys=true], status line [HTTP/1.1 503 Service Unavailable]
{"error":{"root_cause":[{"type":"no_shard_available_action_exception","reason":"[camunda-elasticsearch-master-0][10.229.59.149:9300][indices:data/read/search[phase/query]]"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"camunda-user-8.8.0","node":"HyaF9xB0TRCnaNplDGM9Dw","reason":{"type":"no_shard_available_action_exception","reason":"[camunda-elasticsearch-master-0][10.229.59.149:9300][indices:data/read/search[phase/query]]"}}]},"status":503}

I tried several approaches based on the official Camunda documentation and GPT responses.

But I could not identify where exactly I am going wrong. Could someone please help with this?

values.yaml (1.7 KB)

Hi @Mahesh,

Based on your error logs and the symptoms you’re describing, this appears to be a common issue with Camunda 8.8 self-managed deployments where Elasticsearch indices are missing or have shard availability problems.

Root Cause Analysis

The errors you’re seeing indicate two main issues:

  1. Missing Index/Alias: index_not_found_exception for camunda-user-8.8.0_alias
  2. Shard Availability: no_shard_available_action_exception preventing data access

In Camunda 8.8, authorization data is stored in the search engine, and missing indices or aliases can block Operate and Tasklist from loading properly.

Diagnostic Steps

First, let’s check your Elasticsearch cluster health and indices. Port-forward your Elasticsearch service and run these commands:

# Check cluster health
curl -X GET "http://localhost:9200/_cluster/health?pretty"

# List all indices
curl -X GET "http://localhost:9200/_cat/indices?v"

# Check for specific Camunda indices
curl -X GET "http://localhost:9200/_cat/indices/camunda-*,operate-*,tasklist-*,zeebe-*?v"

Look for:

  • Cluster status (should not be red)
  • Missing indices for camunda-user-*, operate-*, tasklist-*, zeebe-*
  • Unassigned shards
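
The checks above can be condensed into a small helper. This is only a sketch: it assumes Elasticsearch is port-forwarded to localhost:9200 as in the commands above, and the function names es_health_field and es_health_report are made up for illustration. It uses plain sed instead of jq so it runs without extra tools.

```shell
# Hypothetical helper: pull one field out of the JSON returned by
# GET /_cluster/health. Reads the JSON on stdin; $1 is the field name,
# e.g. status or unassigned_shards.
es_health_field() {
  tr -d '\n' |
    sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\"\{0,1\}\([^\",}]*\).*/\1/p"
}

# Hypothetical helper: fetch the cluster health and print the two values
# that matter for this problem. $1 is the base URL (assumed default below).
es_health_report() {
  local url="${1:-http://localhost:9200}" health
  health="$(curl -s --max-time 10 "$url/_cluster/health")" || return 1
  printf 'status=%s unassigned_shards=%s\n' \
    "$(printf '%s' "$health" | es_health_field status)" \
    "$(printf '%s' "$health" | es_health_field unassigned_shards)"
}

# Usage, once the port-forward is running:
#   es_health_report
#   es_health_report http://localhost:9200
```

If unassigned_shards is not 0, calling GET /_cluster/allocation/explain on the same host explains why an unassigned shard cannot be allocated.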

Common Causes & Solutions

If indices are missing:

  • This often happens when indices are accidentally deleted or never created properly during startup
  • Check your Zeebe exporter logs for any errors during index creation

If indices exist but shards are unavailable:

  • Verify Elasticsearch has sufficient resources (CPU, memory, disk)
  • Check if you’re hitting shard limits
  • In single-node setups, ensure replica count is set to 0
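
For the single-node case, the replica count can be set to 0 through the Elasticsearch index settings API. A minimal sketch, assuming Elasticsearch is port-forwarded to localhost:9200; the helper names replica_settings_body and set_replicas are hypothetical. Note that this only clears unassigned replica shards (yellow status). It will not bring back a primary shard that is itself unassigned.

```shell
# Hypothetical helper: print the settings body for a given replica count.
replica_settings_body() {
  printf '{"index":{"number_of_replicas":%d}}' "${1:-0}"
}

# Hypothetical helper: apply that body to all indices matching a pattern.
# $1 = replica count, $2 = index pattern, $3 = base URL (assumed defaults).
set_replicas() {
  local count="${1:-0}" pattern="${2:-_all}" url="${3:-http://localhost:9200}"
  curl -s -X PUT "$url/$pattern/_settings" \
    -H 'Content-Type: application/json' \
    -d "$(replica_settings_body "$count")"
}

# Usage, once the port-forward is running:
#   set_replicas 0              # every index
#   set_replicas 0 'camunda-*'  # only indices matching the pattern
```

This changes existing indices only. Indices created later take their replica count from whatever template or component creates them.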

Immediate Troubleshooting

  1. Check Elasticsearch logs for startup errors or resource issues
  2. Review Zeebe broker logs for exporter errors during index creation
  3. Verify your Helm values - can you share the relevant Elasticsearch configuration from your values.yaml?

Potential Quick Fix

If this is a fresh deployment and you don’t mind losing data, you can try:

  1. Delete the Elasticsearch StatefulSet and PVC
  2. Redeploy with proper resource allocation
  3. Ensure Elasticsearch starts healthy before other components

Could you please share:

  1. Output of the diagnostic commands above
  2. Elasticsearch pod logs
  3. The Elasticsearch-related configuration from your values.yaml

This will help pinpoint whether it’s a missing index issue or a shard allocation problem.

These WARN messages are expected when you haven't deployed a model or job yet; I think you can ignore them. Did you grant the necessary permissions to your user via Identity? What is the exact error message you see in the browser when opening /operate?

Hi @Cris_Ron

I did not enable the Identity component in the config YAML. I thought Zeebe connecting to Elasticsearch would be enough for Operate/Tasklist to work.

When I open Operate/Tasklist, the browser just says connection refused.

Could Identity with embedded Keycloak help in this case? Do you have any suggestions?

Connection refused sounds like an authentication/authorization issue with the Zeebe gateway. There is an option to disable authentication on the gateway; maybe you should try it. It should be described in the Orchestration → broker/gateway section of the docs.
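
One way to check that assumption is to probe the endpoint with curl and see whether the failure happens at the TCP level or at the HTTP level. This is a sketch, assuming the port-forward from the original post is on localhost:8080; classify_probe and probe are made-up names. curl exits with code 7 when it cannot connect at all, while an authentication problem shows up as an HTTP 401 or 403 response.

```shell
# Hypothetical helper: classify a probe result.
# $1 = curl exit code, $2 = HTTP status code (000 when there was no response).
classify_probe() {
  case "$1:$2" in
    7:*)         echo "refused" ;;    # nothing reachable on that host/port
    0:401|0:403) echo "auth" ;;       # server answered, credentials rejected
    0:2*|0:3*)   echo "reachable" ;;  # server answered normally or redirected
    *)           echo "other" ;;      # timeouts, 4xx/5xx, and so on
  esac
}

# Hypothetical helper: probe a URL and print the classification.
probe() {
  local url="$1" code rc
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")"
  rc=$?
  classify_probe "$rc" "$code"
}

# Usage, with the port-forward running:
#   probe http://localhost:8080/operate
#   probe http://localhost:8080/v2/topology
```

A result of "refused" means the request never reached the gateway, so the port-forward or the browser's network path is the first thing to check. A result of "auth" points to the authentication settings.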

As I'm only working with Entra ID authentication, I unfortunately cannot help you with Keycloak. I think the Camunda docs or GPT can help you with a basic setup.