Operate 8.7.3 - Error occurred while archiving data. Will be retried

Hello, over the last few days (possibly since upgrading from 8.6.x to 8.7.3, possibly later; I'm not sure) I have constantly been seeing the following errors in the Operate log. I run Camunda Self-Managed on Kubernetes with the latest provided Helm chart. What does this error mean, and how can I resolve it?

2025-06-03 21:19:04.609 ERROR 7 --- [     archiver_1] i.c.o.a.AbstractArchiverJob              : Error occurred while archiving data. Will be retried.

java.util.concurrent.CompletionException: java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-11 [ACTIVE]
        at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:332) ~[?:?]
        at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:347) ~[?:?]
        at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:708) ~[?:?]
        at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
        at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2194) ~[?:?]
        at io.camunda.operate.util.ElasticsearchUtil$DelegatingActionListener.lambda$onFailure$1(ElasticsearchUtil.java:767) ~[operate-schema-8.7.3.jar:8.7.3]
        at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54) ~[spring-context-6.2.6.jar:6.2.6]
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572) ~[?:?]
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317) ~[?:?]
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
        at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
Caused by: java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-11 [ACTIVE]
        at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:387) ~[httpcore-nio-4.4.16.jar:4.4.16]
        at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:98) ~[httpasyncclient-4.1.5.jar:4.1.5]
        at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:40) ~[httpasyncclient-4.1.5.jar:4.1.5]
        at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175) ~[httpcore-nio-4.4.16.jar:4.4.16]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:261) ~[httpcore-nio-4.4.16.jar:4.4.16]
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:506) ~[httpcore-nio-4.4.16.jar:4.4.16]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:211) ~[httpcore-nio-4.4.16.jar:4.4.16]
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280) ~[httpcore-nio-4.4.16.jar:4.4.16]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) ~[httpcore-nio-4.4.16.jar:4.4.16]
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591) ~[httpcore-nio-4.4.16.jar:4.4.16]
        ... 1 more

Hi @igormsk - I admit I am not 100% certain from just the information provided, but the archive task within Operate is trying to move data to a dated index within Elasticsearch (or OpenSearch) for archival (docs reference). It seems it is timing out after 30 seconds during that operation.

Can you share your Helm values (with secrets redacted), or at least the global, elastic, and operate sections?

values.yaml (3.3 KB)
@nathan.loding Sorry for the long delay with my reply. Here are the parts of the values.yaml file you asked for.
Is there a way to increase the 30 sec timeout for this operation?

I'm still having this problem in my prod and test environments, even with the latest Operate, 8.7.6.

Hi @igormsk - nothing jumps out at me in your values file, and no one else has jumped in on this thread. Have you opened a support ticket yet? That might be the best route for this.

Probably all I need to do is increase the 30-second timeout for this request from Operate to Elasticsearch. How can I do that?

I found this: "How to avoid 30,000ms timeout during reindexing" (Elasticsearch, Discuss the Elastic Stack).
But how can I change the socketTimeout set in Operate?

@igormsk - I don’t know if you can; it would be best to open a support ticket for this, I think.
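
A note for anyone landing here with the same stack trace: 30,000 ms is the default socket (read) timeout of the Elasticsearch low-level REST client, so the connection to Elasticsearch is established but the archiver request is not answered within 30 seconds. Below is a minimal sketch of how the timeout might be raised through the Helm values. It rests on assumptions that were not confirmed in this thread: that Operate 8.7 exposes a `camunda.operate.elasticsearch.socketTimeout` property (in milliseconds), that it can be set through the equivalent Spring environment variable, and that the chart passes `operate.env` entries to the Operate container. The value `120000` is only illustrative. Please verify the property name against the Operate configuration documentation for your version before relying on it.

```yaml
# Sketch only: property name and value are assumptions, not confirmed in this thread.
operate:
  env:
    # Assumed Spring env-var form of camunda.operate.elasticsearch.socketTimeout (milliseconds)
    - name: CAMUNDA_OPERATE_ELASTICSEARCH_SOCKETTIMEOUT
      value: "120000"
```

If the setting takes effect, the number at the start of the `SocketTimeoutException` message should change from 30,000 to the new value. Raising the timeout only hides the symptom if Elasticsearch itself is slow, so it is still worth checking cluster health and load while the archiver runs.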