$ zbctl fail job 2251799827014829 --errorMessage 'manual fail job' --retries=-1
Error: rpc error: code = FailedPrecondition desc = Command 'FAIL' rejected with code 'INVALID_STATE': Expected to fail job with key '2251799827014829', but it is in state 'FAILED'
camunda.operate:
# ELS instance to store Operate data
elasticsearch:
numberOfShards: 3
numberOfReplicas: 1
# Cluster name
clusterName: ce-elasticsearch
# URL
url: http://es.example.com:9200
Are you able to upload Operate log messages, maybe filtered by the flow node instance id, or process instance id? I can imagine that the import/processing of the Zeebe events into Operate has failed.
The flow node instance Id in the Operate screenshot does not match the job Id in the fail command.
You know that when you fail a job, it will be immediately available for reactivation on the broker if it has >0 retries? This means that any worker serving the task type of that job will reactivate it straight away with a new job id.
I have several process instances that always keep “Active” on Operate until I manually click “cancel” button to force cancel them. And workers are healthy during this time.
I cannot find operate logs that contains instance/node id.
In the logs excludes heart beat records such as
2022-07-18 18:18:52.303 [Broker-0-HealthCheckService] [Broker-0-zb-actors-1] INFO
io.camunda.zeebe.broker.system - Partition-1 recovered, marking it as healthy
I see some WARN logs like
2022-07-18 18:23:37.720 [] [grpc-default-worker-ELG-5-2] WARN
io.grpc.netty.NettyServerHandler - Stream Error
io.netty.handler.codec.http2.Http2Exception$StreamException: Stream closed before write could take place
at io.netty.handler.codec.http2.Http2Exception.streamError(Http2Exception.java:172) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.DefaultHttp2RemoteFlowController$FlowState.cancel(DefaultHttp2RemoteFlowController.java:481) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.DefaultHttp2RemoteFlowController$1.onStreamClosed(DefaultHttp2RemoteFlowController.java:105) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.DefaultHttp2Connection.notifyClosed(DefaultHttp2Connection.java:357) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.DefaultHttp2Connection$ActiveStreams.removeFromActiveStreams(DefaultHttp2Connection.java:1007) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.DefaultHttp2Connection$ActiveStreams.deactivate(DefaultHttp2Connection.java:963) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.DefaultHttp2Connection$DefaultStream.close(DefaultHttp2Connection.java:515) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.DefaultHttp2Connection$DefaultStream.close(DefaultHttp2Connection.java:521) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.Http2ConnectionHandler.closeStream(Http2ConnectionHandler.java:613) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onRstStreamRead(DefaultHttp2ConnectionDecoder.java:444) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.Http2InboundFrameLogger$1.onRstStreamRead(Http2InboundFrameLogger.java:80) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readRstStreamFrame(DefaultHttp2FrameReader.java:509) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.DefaultHttp2FrameReader.processPayloadState(DefaultHttp2FrameReader.java:259) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readFrame(DefaultHttp2FrameReader.java:159) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.Http2InboundFrameLogger.readFrame(Http2InboundFrameLogger.java:41) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder.decodeFrame(DefaultHttp2ConnectionDecoder.java:173) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.Http2ConnectionHandler$FrameDecoder.decode(Http2ConnectionHandler.java:378) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.Http2ConnectionHandler.decode(Http2ConnectionHandler.java:438) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:510) ~[netty-codec-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449) ~[netty-codec-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279) ~[netty-codec-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) ~[netty-transport-classes-epoll-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480) ~[netty-transport-classes-epoll-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) ~[netty-transport-classes-epoll-4.1.74.Final.jar:4.1.74.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) ~[netty-common-4.1.74.Final.jar:4.1.74.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.74.Final.jar:4.1.74.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.74.Final.jar:4.1.74.Final]
at java.lang.Thread.run(Unknown Source) ~[?:?]
2022-07-18 18:23:37.721 [] [grpc-default-worker-ELG-5-2] WARN
2022-07-18 03:32:47.010 [] [grpc-default-worker-ELG-5-3] INFO
io.grpc.netty.NettyServerTransport.connections - Transport failed
io.netty.handler.codec.http2.Http2Exception: HTTP/2 client preface string missing or corrupt. Hex dump for received bytes: 16030100f3010000ef03039dd5e0b87f25619d3e29dd7d50
at io.netty.handler.codec.http2.Http2Exception.connectionError(Http2Exception.java:108) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.Http2ConnectionHandler$PrefaceDecoder.readClientPrefaceString(Http2ConnectionHandler.java:306) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.Http2ConnectionHandler$PrefaceDecoder.decode(Http2ConnectionHandler.java:239) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.http2.Http2ConnectionHandler.decode(Http2ConnectionHandler.java:438) ~[netty-codec-http2-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:510) ~[netty-codec-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449) ~[netty-codec-4.1.74.Final.jar:4.1.74.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279) ~[netty-codec-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) ~[netty-transport-classes-epoll-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480) ~[netty-transport-classes-epoll-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) ~[netty-transport-classes-epoll-4.1.74.Final.jar:4.1.74.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) ~[netty-common-4.1.74.Final.jar:4.1.74.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.74.Final.jar:4.1.74.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.74.Final.jar:4.1.74.Final]
at java.lang.Thread.run(Unknown Source) ~[?:?]
And I’ve found that, all pending “Active” process instances shows on Operate is actually failed by “FailJobCommand” by my workers.
I can directly access the incident data from ES shows that
"incidents": [
{
"errorMessage": "Failed to send activated jobs to client",
"errorType": "No more retries left",
"flowNodeId": "90iel4msulo5jjb2nvg",
"uid": "90iel4msulo5jjb2nvg",
"creationTime": "2022-07-04T07:56:37.538+0000"
}
]