We have a project where we encountered a problem with Elasticsearch shards in Camunda 8. Camunda is deployed using the Helm charts for a full-scale deployment in a Kubernetes environment:
We get the error message shown below:
io.camunda.zeebe.exporter.ElasticsearchExporterException: Failed to flush bulk request: [Failed to flush item(s) of bulk request [type: validation_exception, reason: Validation Failed: 1: this action would add  shards, but this cluster currently has / maximum normal shards open;]]
The problem was solved temporarily by increasing the maximum number of shards through the Elasticsearch API.
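For context, the temporary fix mentioned above can be done with a single call to the Elasticsearch cluster-settings API. This is a sketch that assumes Elasticsearch is reachable on localhost:9200 and that the new limit of 2000 is just an example value:

```shell
# Raise the cluster-wide shard limit (default: 1000 per data node).
# 2000 is an example value; pick one that fits your cluster size.
curl -X PUT "http://localhost:9200/_cluster/settings" \
  -H "Content-Type: application/json" \
  -d '{"persistent": {"cluster.max_shards_per_node": 2000}}'
```

Note that this only buys time: the shard count keeps growing daily unless old indices are cleaned up.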
The question is:
How should we think going forward so that we don’t run into the same problem again, for both Helm and other deployment types? Are there any other settings we should check? How should we set the shard limit before deployment with Helm?
Apendo Dev Team
Hi Apendo Team and welcome to the forum,
My understanding is the following: The different components of Camunda create new shards daily, which can result in many shards and may even exceed the limit of 1000.
There are different parameters that you may want to consider:
- Zeebe has a parameter numberOfShards, which you can configure via an environment variable. Its default value is 3 (thus, Zeebe creates 3 new shards daily).
- Camunda 8 has a retention policy that configures when shards should be deleted. In the self-managed setup, this retention policy is not activated by default. Within the values file, you can set the parameter retentionPolicy.enabled to true. Without any additional configuration, Zeebe’s shards will be deleted daily; the shards for Operate and Tasklist will be deleted after 30 days.
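The two settings above could be combined in a values.yaml fragment roughly like the following. This is a sketch: the retentionPolicy key and the environment-variable name are assumptions based on the camunda-platform Helm chart and Zeebe’s exporter configuration, so verify them against your chart version:

```yaml
# Sketch of a values.yaml fragment for the camunda-platform Helm chart.
# Key names are assumptions; check your chart version's documentation.
retentionPolicy:
  enabled: true        # activate cleanup of dated indices

zeebe:
  env:
    # Maps to the exporter setting index.numberOfShards (default: 3).
    - name: ZEEBE_BROKER_EXPORTERS_ELASTICSEARCH_ARGS_INDEX_NUMBEROFSHARDS
      value: "1"
```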
I don’t consider myself an expert on this topic and invite others to comment as well.
I have the same problem with a dev machine deployed with docker-compose.
How can the retentionPolicy be configured with docker-compose environment variables? I don’t find any hints in the docs.
It’s true this is currently not part of the documentation.
You should be able to activate the retention policy via the environment variable RETENTIONPOLICY_ENABLED. Set the value to true.
Thanks for your quick answer!
But I’m not sure on which containers I need to configure the retention policy. I put it on Zeebe, Operate, and Tasklist. Is this correct? I also configured RETENTIONPOLICY_SCHEDULE, but I do not see any log message and there are still 1000 shards …
Do I need another docker container?
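To see whether anything is actually being cleaned up, the current shard count and the configured limit can be checked directly against Elasticsearch. A sketch, assuming it listens on localhost:9200:

```shell
# Show cluster status and the number of currently active shards.
curl -s "http://localhost:9200/_cluster/health?filter_path=status,active_shards"

# Show the effective shard limit (including the default value).
curl -s "http://localhost:9200/_cluster/settings?include_defaults=true&filter_path=*.cluster.max_shards_per_node"
```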
I just double-checked, and my assumption was wrong.
This is not a property of the components. Instead, Elasticsearch’s Curator will be configured to clean up. I think there is no equivalent in the docker-compose files.
Ok, too bad, but I’ll see if I can use the Curator container somehow via docker-compose.
@nathanael - any update on this? We too are on the docker based setup and are hitting this error.
Also would adding this configuration and restarting the docker components help out? Are there any other workarounds?
@StephanHaarmann / @nathan.loding Any updates on this issue?
Recent versions of Camunda 8 use Elasticsearch’s index lifecycle management (ILM); the Curator is no longer needed (e.g., in ES 8.x). You can also drop dated indices manually without affecting running instances.
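Dropping dated indices by hand can be sketched as follows. The zeebe-record-* prefix matches the exporter’s default index prefix, and the date is purely hypothetical; adjust both to your setup:

```shell
# List dated Zeebe record indices, sorted by name, with size information.
curl -s "http://localhost:9200/_cat/indices/zeebe-record-*?h=index,docs.count,store.size&s=index"

# Delete all record indices of one (hypothetical) day.
# Note: ES 8 rejects wildcard deletes by default unless
# action.destructive_requires_name is set to false, so you may
# have to name the indices explicitly instead.
curl -X DELETE "http://localhost:9200/zeebe-record-*-2023-01-15"
```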
Depending on your setup and volume, you may still need a higher shard limit than the default configuration (1000).
@StephanHaarmann Thanks for your reply.
- How can we retrieve the list of indices in ES, in a self-managed cluster, that were created by the different Camunda components?
- Is it documented which index is created by which component (like Operate, Optimize, Tasklist, broker, gateway, etc.)?
You can use Kibana to look into your Elasticsearch database.
The docker-compose installation includes a profile to enable the Kibana container: GitHub - camunda/camunda-platform: Links to Camunda Platform 8 resources, releases, and local development config
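If you prefer the command line over Kibana, the indices can also be grouped by component prefix directly via the _cat API. A sketch, assuming the default prefixes (zeebe-record-, operate-, tasklist-, optimize-) and Elasticsearch on localhost:9200:

```shell
# List indices per Camunda component, assuming the default index prefixes.
for prefix in zeebe-record operate tasklist optimize; do
  echo "== ${prefix} =="
  curl -s "http://localhost:9200/_cat/indices/${prefix}-*?h=index&s=index"
done
```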
Hope this helps, Ingo