Hi, I am having trouble with Camunda Operate in my K3s cluster. Any help would be appreciated, as I am getting quite desperate at the moment.
Here is some information:
I have a K3s cluster on a server with Helm installed. I am installing the Camunda Self-Managed platform using the official Helm charts, with the default values and only small changes to resources and replica counts. Elasticsearch is up and running, and I can access and use it as needed. The Zeebe, Zeebe Gateway, and Connectors pods are also up and running, but the Operate pod keeps failing because of failed shards in Elasticsearch:
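For reference, my install is roughly the following (release and namespace names are mine; the values file only tweaks resources and replica counts relative to the chart defaults):

```shell
# Add the official Camunda Helm repository and install the platform chart
helm repo add camunda https://helm.camunda.io
helm repo update

# values.yaml: chart defaults with small changes to resources and replica counts
helm install camunda camunda/camunda-platform -n camunda -f values.yaml
```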
2024-05-20 20:36:36.028 WARN 7 --- [ main] o.e.c.RestClient : request [POST http://camunda-elasticsearch:9200/operate-migration-steps-repository-1.1.0_/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=true&expand_wildcards=open&allow_no_indices=true&ignore_throttled=false&search_type=query_then_fetch&batched_reduce_size=512] returned 1 warnings: [299 Elasticsearch-8.12.2-48a287ab9497e852de30327444b0809e55d46466 "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
2024-05-20 20:36:36.029 WARN 7 --- [ main] i.c.o.u.RetryOperation : Retry Operation Count search results failed: Elasticsearch exception [type=search_phase_execution_exception, reason=all shards failed]
When I look into the Elasticsearch pod's logs, I see this:
Caused by: org.elasticsearch.action.NoShardAvailableActionException
at org.elasticsearch.server@8.12.2/org.elasticsearch.action.NoShardAvailableActionException.forOnShardFailureWrapper(NoShardAvailableActionException.java:28)
at org.elasticsearch.server@8.12.2/org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:529)
at org.elasticsearch.server@8.12.2/org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:476)
_cluster/health:
{
"cluster_name": "elastic",
"status": "red",
"timed_out": false,
"number_of_nodes": 2,
"number_of_data_nodes": 0,
"active_primary_shards": 0,
"active_shards": 0,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 22,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 0.0
}
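One thing that stands out to me in the output above is `"number_of_data_nodes": 0` even though there are 2 nodes, so I also checked the node roles (the `camunda-elasticsearch` hostname comes from my Helm release; adjust if yours differs):

```shell
# Query cluster health (this is how I got the JSON above)
curl -s "http://camunda-elasticsearch:9200/_cluster/health?pretty"

# List each node's roles -- with number_of_data_nodes at 0,
# I suspect no node is currently acting as a data node
curl -s "http://camunda-elasticsearch:9200/_cat/nodes?v&h=name,node.role,master"
```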
_cat/shards?v:
index shard prirep state docs store dataset ip node
operate-decision-requirements-8.3.0_ 0 p UNASSIGNED
operate-metric-8.3.0_ 0 p UNASSIGNED
operate-operation-8.4.0_ 0 p UNASSIGNED
operate-list-view-8.3.0_ 0 p UNASSIGNED
operate-user-1.2.0_ 0 p UNASSIGNED
operate-web-session-1.1.0_ 0 p UNASSIGNED
operate-decision-instance-8.3.0_ 0 p UNASSIGNED
operate-process-8.3.0_ 0 p UNASSIGNED
operate-decision-8.3.0_ 0 p UNASSIGNED
.ds-.logs-deprecation.elasticsearch-default-2024.05.20-000001 0 p UNASSIGNED
operate-flownode-instance-8.3.1_ 0 p UNASSIGNED
operate-post-importer-queue-8.3.0_ 0 p UNASSIGNED
operate-import-position-8.3.0_ 0 p UNASSIGNED
operate-event-8.3.0_ 0 p UNASSIGNED
operate-incident-8.3.1_ 0 p UNASSIGNED
operate-batch-operation-1.0.0_ 0 p UNASSIGNED
operate-sequence-flow-8.3.0_ 0 p UNASSIGNED
.ds-ilm-history-5-2024.05.20-000001 0 p UNASSIGNED
operate-migration-steps-repository-1.1.0_ 0 p UNASSIGNED
operate-user-task-8.5.0_ 0 p UNASSIGNED
operate-variable-8.3.0_ 0 p UNASSIGNED
operate-message-8.5.0_ 0 p UNASSIGNED
_cluster/allocation/explain?pretty:
{
"note": "No shard was specified in the explain API request, so this response explains a randomly chosen unassigned shard. There may be other unassigned shards in this cluster which cannot be assigned for different reasons. It may not be possible to assign this shard until one of the other shards is assigned correctly. To explain the allocation of other shards (whether assigned or unassigned) you must specify the target shard in the request to this API.",
"index": "operate-decision-requirements-8.3.0_",
"shard": 0,
"primary": true,
"current_state": "unassigned",
"unassigned_info": {
"reason": "CLUSTER_RECOVERED",
"at": "2024-05-20T19:49:30.119Z",
"last_allocation_status": "no"
},
"can_allocate": "no",
"allocate_explanation": "Elasticsearch isn't allowed to allocate this shard to any of the nodes in the cluster. Choose a node to which you expect this shard to be allocated, find this node in the node-by-node explanation, and address the reasons which prevent Elasticsearch from allocating this shard there."
}
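Since the note in the explain output says it picked a random unassigned shard, I can also target a specific shard like this (index name taken from my `_cat/shards` output above):

```shell
# Explain allocation for one specific unassigned primary shard
curl -s -X POST "http://camunda-elasticsearch:9200/_cluster/allocation/explain?pretty" \
  -H 'Content-Type: application/json' \
  -d '{
    "index": "operate-decision-requirements-8.3.0_",
    "shard": 0,
    "primary": true
  }'
```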
I have no more information, and it seems there are no existing answers on this topic here. Is there something I can do about it?
Thank you very much
Vojtech