Zeebe indexes not created in OpenSearch

Hi,

I am trying to understand how Zeebe in Camunda 8 creates indexes in OpenSearch. Some of the deployments in our setup are complaining:

Operate
[Exception occurred for alias [zeebe-record-decision-requirements], while obtaining next Zeebe records batch: Failed to search index: [zeebe-record-decision-requirements]! Reason: Read timed out]

Tasklist
[Exception occurred for alias [zeebe-record-form], while obtaining next Zeebe records batch: Read timed out]

There are no zeebe-* indexes / aliases created in OpenSearch. These are specifically missing:

  • zeebe-record-decision-requirements
  • zeebe-record-incident
  • zeebe-record-process
  • zeebe-record-user-task
  • zeebe-record-variable

In OpenSearch I see that a multitude of indexes are created for the other services:

  • tasklist-*
  • optimize-*
  • operate-*
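To make the comparison above repeatable, the index and alias names returned by OpenSearch (for example from `GET _cat/indices?format=json` or `GET _cat/aliases?format=json`, which would need a signed request in this setup) can be checked against the expected names. A minimal Python sketch, where the function names and the sample input are hypothetical and only the expected names are taken from this post:

```python
# Expected Zeebe aliases, copied from the list of missing ones in this post.
EXPECTED_ZEEBE = [
    "zeebe-record-decision-requirements",
    "zeebe-record-incident",
    "zeebe-record-process",
    "zeebe-record-user-task",
    "zeebe-record-variable",
]


def missing_aliases(existing_names, expected=EXPECTED_ZEEBE):
    """Return the expected names that do not appear in existing_names."""
    existing = set(existing_names)
    return [name for name in expected if name not in existing]


def count_by_prefix(existing_names,
                    prefixes=("zeebe", "operate", "tasklist", "optimize")):
    """Count how many index or alias names start with each '<prefix>-'."""
    counts = {prefix: 0 for prefix in prefixes}
    for name in existing_names:
        for prefix in prefixes:
            if name.startswith(prefix + "-"):
                counts[prefix] += 1
    return counts


# Hypothetical sample; real names would come from the _cat API response.
sample = ["operate-list-view", "tasklist-task", "optimize-process-definition"]
print(missing_aliases(sample))
print(count_by_prefix(sample))
```

A zero count for `zeebe` next to non-zero counts for the other prefixes would match what is described here: the web applications can write to OpenSearch, while the Zeebe exporter has not created anything.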

Question 1
I have read the 8.6 documentation about Exporters, but it is unclear to me whether configuring an exporter is a required step for a vanilla Camunda setup, or only needed for custom exporters.

Question 2
If exporters are not required, is there a typical reason, or a place to look, to understand why the indexes are not created? The connection and access rules from Camunda to OpenSearch seem to be OK, as the other indexes are created.

Our environment
We render the templates with Helm, and Flux then installs them into EKS. EKS runs Istio with quite strict rules. OpenSearch runs in AWS. Camunda authenticates using aws.enabled=true, not basic auth.

Still investigating this.

Inside the Zeebe-container I find this configuration:

cat application.yaml
zeebe:
  broker:
    exporters:
      opensearch:
        className: "io.camunda.zeebe.exporter.opensearch.OpensearchExporter"
        args:
          url: "https://open-search.account-123.aws.domain.com:443"
          aws:
            enabled: true
    gateway:
      enable: true
      network:
        port: 26500
      security:
        enabled: false
        authentication:
          mode: none
    network:
      host: 0.0.0.0
      commandApi:
        port: 26501
      internalApi:
        port: 26502
      monitoringApi:
        port: "8081"
    cluster:
      clusterSize: "3"
      replicationFactor: "3"
      partitionsCount: "3"
      clusterName: camunda-zeebe
    threads:
      cpuThreadCount: "3"
      ioThreadCount: "3"

I am able to reach the OpenSearch port using curl. It returns Unauthorized, which is expected because I did not sign the request.

Zeebe Pod Environment
The pod has these environment variables set, and the token file holds a token with the expected identity from EKS:

AWS_ROLE_ARN=arn:aws:iam::123:role/a/service-role/b-dev-oidc-open-search-role
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/iam/service_token
AWS_REGION=eu-north-1

The token is presumably exchanged with AWS STS in order to get Temporary Security Credentials used against OpenSearch.
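The exchange described above corresponds to the STS AssumeRoleWithWebIdentity operation, which the AWS SDK credential chain can perform when AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE are set. Before looking at STS itself, it may help to confirm the preconditions inside the pod. The following is only a sketch of those checks in Python (the function name is hypothetical, and the Zeebe image may not ship with Python, so the same checks can be done by hand):

```python
import os

# The three variables listed in this post.
EXPECTED_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE", "AWS_REGION")


def check_web_identity_env(env=os.environ):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for var in EXPECTED_VARS:
        if not env.get(var):
            problems.append(f"{var} is not set")

    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if token_file:
        if not os.path.isfile(token_file):
            problems.append(f"token file {token_file} does not exist")
        elif os.path.getsize(token_file) == 0:
            problems.append(f"token file {token_file} is empty")
    return problems
```

An empty result only shows that the inputs for the exchange are present. It does not prove that STS accepts the token or that the role is allowed to write to the OpenSearch domain.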

Zeebe logs

There are some warnings related to RaftServer, no errors.

Every minute after startup, from each pod:
DEBUG: Current exporter state {opensearch={position=-1, metadata=}}
My understanding of position=-1 is that no change has been made to any index. Is that correct?
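To track this over time across the three pods, the state line can be parsed from the logs. The sketch below assumes the line format shown above and treats a negative position as "no record acknowledged by the exporter yet". That reading of -1 is an assumption that should be verified against the Zeebe source. The function names are hypothetical.

```python
import re

# Matches entries such as: opensearch={position=-1, metadata=}
STATE_RE = re.compile(r"([\w-]+)=\{position=(-?\d+), metadata=[^}]*\}")


def parse_exporter_state(line):
    """Map each exporter id in a 'Current exporter state' line to its position."""
    return {name: int(position) for name, position in STATE_RE.findall(line)}


def exporters_without_progress(line):
    """Return exporter ids whose position is negative (assumed: nothing exported)."""
    return [name for name, pos in parse_exporter_state(line).items() if pos < 0]


line = "DEBUG: Current exporter state {opensearch={position=-1, metadata=}}"
print(parse_exporter_state(line))
print(exporters_without_progress(line))
```

If the position stays at -1 after processes have been deployed and started, that would suggest the exporter never completed a successful flush, which fits with the zeebe-record indexes not existing.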

During startup:

DEBUG: Exporter configured with 
OpensearchExporterConfiguration{
    url='https://open-search.account-123.aws.domain.com:443', 
    index=IndexConfiguration{
        indexPrefix='zeebe-record', 
        createTemplate=true, 
        command=false, 
        event=true, 
        rejection=false, 
        error=true, 
        deployment=true, 
        process=true, 
        incident=true, 
        job=true, 
        message=true, 
        messageBatch=false, 
        messageSubscription=true, 
        variable=true, 
        variableDocument=true, 
        processInstance=false, 
        processInstanceBatch=true, 
        processInstanceCreation=true, 
        processInstanceMigration=true, 
        processInstanceModification=true, 
        processMessageSubscription=true, 
        decisionRequirements=true, 
        decision=true, 
        decisionEvaluation=true, 
        checkpoint=false, 
        timer=true, 
        messageStartEventSubscription=true, 
        processEvent=false, 
        deploymentDistribution=true, 
        escalation=true, 
        signal=true, 
        signalSubscription=true, 
        resourceDeletion=true, 
        recordDistribution=true, 
        form=true, 
        userTask=true, 
        compensationSubscription=true
    }, 
    bulk=BulkConfiguration{
        delay=5, 
        size=1000, 
        memoryLimit=10485760
    }, 
    aws=AwsConfiguration{
        serviceName=es, 
        region=eu-north-1
    }, 
    retention=RetentionConfiguration{
        isEnabled=false, 
        minimumAge='30d, 
        policyName='zeebe-record-retention-policy, 
        policyDescription='Zeebe record retention policy'
    }
}
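The dump above can also be turned into the list of aliases that Operate and Tasklist would expect. The sketch below assumes that alias names follow the pattern seen in the error messages (index prefix, a hyphen, then the flag name in kebab-case, so `decisionRequirements` becomes `zeebe-record-decision-requirements`). The naming rule is inferred from this post, not taken from the exporter source, and the helper names are hypothetical.

```python
import re

# Flags in IndexConfiguration that are not value types (assumption based on
# their names: a template switch and the record-type filters).
NON_VALUE_TYPE_FLAGS = {"createTemplate", "command", "event", "rejection"}


def enabled_value_types(dump):
    """Return flag names set to true inside the IndexConfiguration{...} part."""
    match = re.search(r"index=IndexConfiguration\{(.*?)\}", dump, re.S)
    body = match.group(1) if match else ""
    flags = re.findall(r"(\w+)=(true|false)", body)
    return [name for name, value in flags
            if value == "true" and name not in NON_VALUE_TYPE_FLAGS]


def alias_for(value_type, prefix="zeebe-record"):
    """Convert a camelCase flag name to the assumed '<prefix>-<kebab-case>' alias."""
    kebab = re.sub(r"([a-z0-9])([A-Z])", r"\1-\2", value_type).lower()
    return f"{prefix}-{kebab}"


# Shortened, hypothetical excerpt of the dump above.
dump = ("OpensearchExporterConfiguration{url='x', index=IndexConfiguration{"
        "indexPrefix='zeebe-record', createTemplate=true, command=false, "
        "event=true, process=true, processInstance=false, "
        "decisionRequirements=true, userTask=true}, "
        "bulk=BulkConfiguration{delay=5}}")
for value_type in enabled_value_types(dump):
    print(alias_for(value_type))
```

Comparing this list with what actually exists in OpenSearch shows whether only some aliases are missing or, as appears to be the case here, none were created at all.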
