Cross-posting the corresponding issue on GitHub.
We have now spent several days on this with the help of @VonDerBeck. We cannot narrow down whether it is an infrastructure issue, a problem in Camunda, Keycloak, or the Keycloak plugin, or whether we somehow have an unusual LDAP structure.
Suspicious
Things we find curious:
[Camunda] Big group-query
Either the Keycloak plugin or Camunda triggers a huge group query when opening the task list. It queries basically every group that is available, as this log line shows:
2021-07-21 12:51:16.970 DEBUG 9 --- [nio-8080-exec-8] org.camunda.bpm.extension.keycloak : KEYCLOAK-01050 Keycloak group query results:
The query is huge because we have roughly 800 groups.
We cannot imagine why all of them have to be queried when only 2 tasks are available.
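In case somebody wants to check for the same log line: it is only written at DEBUG level. A minimal sketch, assuming a Spring Boot setup like ours where logging levels can be set in the application YAML; the logger name is taken from the log line above:

```yaml
logging:
  level:
    # logger name as it appears in the KEYCLOAK-01050 log line
    org.camunda.bpm.extension.keycloak: DEBUG
```

Expect a lot of output with many groups, so this is probably better switched off again after debugging.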
[Keycloak] Querying all users via REST is slow, too
Using the Camunda client ID to get all users from the Keycloak REST API is very slow, too. So the queries themselves are in fact slow. But after days of searching we cannot figure out which configuration is responsible, or at least uncommon.
Tweaks
In case somebody runs into a similar problem, these actions showed improvements:
[Camunda] Enhance Hikari config
This only delays the crashes, since DB connections might still time out eventually.
Camunda seems to keep a DB transaction open for the duration of its Keycloak queries. When the queries are slow, the open DB connections block the complete database. Adding the following to your default.yml or production.yml helps:
spring.datasource:
  # see https://github.com/brettwooldridge/HikariCP#gear-configuration-knobs-baby
  hikari:
    connectionTimeout: 30000
    idleTimeout: 600000
    maxLifetime: 900000
    # Hikari 4.0
    keepaliveTime: 60000
    # see https://github.com/brettwooldridge/HikariCP#gear-configuration-knobs-baby (minimumIdle)
    # and https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-Sizing
    minimumIdle: 10
    maximumPoolSize: 20
[Camunda] Keycloak Plugin >= 2.1.1 required
This improves performance of the UI views “Tasklist” and “Admin → List all Users”.
In our experiments we always needed both properties shown here. Because caching is only available from version 2.1.1 onwards, you need the current snapshot in order to set cacheEnabled:
# Camunda Keycloak Identity Provider Plugin
plugin.identity.keycloak:
  cacheEnabled: true
  authorizationCheckEnabled: false
At the date of this posting, version 2.1.1 is only available as a snapshot.
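With a directory of our size it may also be worth tuning the cache itself. This is a sketch only and not something we measured: maxCacheSize and cacheExpirationTimeoutMin are the property names as we understand them from the plugin README, and the values are made-up examples, so please verify both against the plugin version you actually run:

```yaml
# Camunda Keycloak Identity Provider Plugin
plugin.identity.keycloak:
  cacheEnabled: true
  # assumed property names, example values only
  maxCacheSize: 1000
  cacheExpirationTimeoutMin: 30
  authorizationCheckEnabled: false
```

A longer expiration means fewer queries against Keycloak, but changes to users and groups show up later in Camunda.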
[Keycloak] Limit amount of users in User Federation
This improves performance of the UI views “Tasklist” and “Admin → List all Users”.
I left the company realm and created a realm specifically for Camunda. I narrowed down the LDAP tree that gets synced to only the people who might possibly work with Camunda. Now I have only 125 users and 440 groups (the groups contain recursions). That improved query performance.
[Keycloak] More performance for Keycloak DB 🤷
We increased the cluster resources for the Keycloak Postgres DB to 1 CPU and 2 GiB, but we did not see any effect.