Postgres db queries with distinct

Julio_Parrao · August 10, 2021, 5:39pm

Hello,

I have noticed that most of Camunda’s database queries use “select distinct…” to perform searches. For example, this is the query issued when I search historical process instances by tenantId (I’m using PostgreSQL 11.12):

select distinct RES.*
from (
  SELECT SELF.*, DEF.NAME_, DEF.VERSION_, DEF.DEPLOYMENT_ID_
  FROM ACT_HI_PROCINST SELF
  LEFT JOIN ACT_RE_PROCDEF DEF ON SELF.PROC_DEF_ID_ = DEF.ID_
  WHERE ( SELF.TENANT_ID_ in ( 'aTenantId' ) )
) RES
order by RES.ID_ asc
LIMIT 20 OFFSET 0

My question is: why does Camunda use “distinct” in almost all of its database queries? In the example above, it makes the query dramatically slower (about 200 ms without distinct vs. around 8 seconds with it). In my tests so far, I always get the same results, in the same quantity and the same order, with and without “distinct”.

In which scenarios can Camunda database queries return duplicates, making the distinct keyword necessary?

What would happen if we simply removed “distinct” from all queries?

Thanks!

Artem_Smirnov · August 11, 2021, 5:23am

Why not try removing distinct and testing the performance? It’s fairly easy to change the mapping files.

Julio_Parrao · August 11, 2021, 2:01pm

Thanks for your answer, Artem.

I tested it, and the performance is much better without “distinct”. So far we haven’t found any issues, but I would still like to know in which scenarios Camunda database queries can return duplicates, making the distinct keyword necessary.
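As a general SQL illustration (not a statement about why Camunda's mappings are written this way): duplicates can only appear when the selected rows come from one side of a join that can match more than once. The sketch below simulates that in plain Java with made-up data; `JoinDuplicatesSketch`, the ids and the "many side" map are all hypothetical. In the query from the first post, the join is on DEF.ID_, so assuming that column is unique in ACT_RE_PROCDEF, each process instance matches at most one definition and no duplicates would be expected from that join alone.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

final class JoinDuplicatesSketch {

    // Left join that keeps only the left-hand id, mimicking "SELECT SELF.ID_ ... LEFT JOIN ...".
    static List<String> leftJoinIds(List<String> leftIds, Map<String, List<String>> rightByLeftId) {
        List<String> rows = new ArrayList<>();
        for (String id : leftIds) {
            int matches = rightByLeftId.getOrDefault(id, List.of()).size();
            // A left join emits one row per match, or a single row when nothing matches.
            for (int i = 0; i < Math.max(1, matches); i++) {
                rows.add(id);
            }
        }
        return rows;
    }

    // Equivalent of "SELECT DISTINCT": drop repeated rows, keep first-seen order.
    static List<String> distinct(List<String> rows) {
        return new ArrayList<>(new LinkedHashSet<>(rows));
    }

    public static void main(String[] args) {
        List<String> instances = List.of("pi-1", "pi-2");
        // Hypothetical one-to-many side: pi-1 has two matching rows, pi-2 has none.
        Map<String, List<String>> manySide = Map.of("pi-1", List.of("a", "b"));
        List<String> joined = leftJoinIds(instances, manySide);
        System.out.println(joined);           // [pi-1, pi-1, pi-2]
        System.out.println(distinct(joined)); // [pi-1, pi-2]
    }
}
```

So before removing “distinct” from a given mapping, it seems worth checking whether any of its joins (including ones that are only added for certain filters) can match several rows per result.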

Jean_Robert_Alves · August 11, 2021, 2:11pm

How can I change the queries used by Camunda in a Spring Boot project?

Julio_Parrao · August 11, 2021, 9:07pm

Here is information about how to write custom queries: Custom Queries - Camunda

Jean_Robert_Alves · August 12, 2021, 11:59am

Nice post, @Julio_Parrao, thanks! But can I override the default queries used by the Camunda process engine, such as the job acquisition one?

Artem_Smirnov · August 12, 2021, 1:56pm

I just put them here and it overrides the defaults )

Julio_Parrao · August 12, 2021, 2:33pm

Yes, you can. I haven’t tested what Artem described, but it might work if you use the same package name as Camunda, and it looks easier than my method.

What I’m doing is overriding this method:

github.com/camunda/camunda-bpm-platform/blob/master/engine/src/main/java/org/camunda/bpm/engine/impl/cfg/ProcessEngineConfigurationImpl.java#L1774

            properties.put("collationForCaseSensitivity", DbSqlSessionFactory.databaseSpecificCollationForCaseSensitivity.get(databaseType));

            Map<String, String> constants = DbSqlSessionFactory.dbSpecificConstants.get(databaseType);
            for (Entry<String, String> entry : constants.entrySet()) {
              properties.put(entry.getKey(), entry.getValue());
            }
          }
        }

        protected InputStream getMyBatisXmlConfigurationSteam() {
          return ReflectUtil.getResourceAsStream(DEFAULT_MYBATIS_MAPPING_FILE);
        }

        // session factories ////////////////////////////////////////////////////////

        protected void initIdentityProviderSessionFactory() {
          if (identityProviderSessionFactory == null) {
            identityProviderSessionFactory = new GenericManagerFactory(DbIdentityServiceProvider.class);
          }
        }

so that I can inject my own MyBatis mapping file. Then I just copied the mappings.xml file (camunda-bpm-platform/mappings.xml at master · camunda/camunda-bpm-platform · GitHub) into my project and modified it to reference my custom mapping files wherever I wanted.
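The shape of that override can be sketched in a self-contained way. Note the assumptions: `EngineConfigurationStandIn` is a hypothetical stand-in, not Camunda's class, and the two resource paths and their contents are invented so the example runs without any files. Only the method name getMyBatisXmlConfigurationSteam is taken from the excerpt above. In a real project the subclass would presumably extend whichever Camunda engine configuration class the setup already uses, and return a stream for the copied mappings.xml.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Stand-in for the engine configuration class; NOT Camunda's real class.
class EngineConfigurationStandIn {
    // Hypothetical in-memory "classpath" so the sketch runs without resource files.
    static final Map<String, String> RESOURCES = Map.of(
        "default/mappings.xml", "<configuration><!-- default mappers --></configuration>",
        "custom/mappings.xml", "<configuration><!-- custom mappers --></configuration>");

    static InputStream open(String path) {
        return new ByteArrayInputStream(RESOURCES.get(path).getBytes(StandardCharsets.UTF_8));
    }

    // Same method name as in the excerpt above (including its "Steam" spelling).
    protected InputStream getMyBatisXmlConfigurationSteam() {
        return open("default/mappings.xml");
    }

    // Reads whatever stream the hook returns.
    String loadMappings() {
        try (InputStream in = getMyBatisXmlConfigurationSteam()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// The override: point the hook at the copied and modified mappings file.
class CustomMappingsConfiguration extends EngineConfigurationStandIn {
    @Override
    protected InputStream getMyBatisXmlConfigurationSteam() {
        return open("custom/mappings.xml");
    }
}
```

With this in place, `new CustomMappingsConfiguration().loadMappings()` returns the custom file's content, while the base class keeps returning the default one. Nothing else in the configuration has to change, because only the hook method is replaced.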