Byterray leak

tair · December 3, 2017, 5:27pm

Hi all!

Today I analyzed act_ge_bytearray table and found that there are a bunch of bytearray variables instances (files, json, etc.) that are not referenced from either act_ru_variable or act_hi_varinst. They occupy nearly 30% of act_ge_bytearray size.

Can this garbage be left at process migration (we use it here)? If so, is this expected behavior or bug?

sdorokhova · December 4, 2017, 8:58am

Hi @tair,
you should also check ACT_HI_DETAIL. It stores all changes of variable values.

tair · December 4, 2017, 12:23pm

Thank you @sdorokhova! That was exactly the reason.

This is the script I use to check for leak in Postgres

with ba AS (
    select id_ as id from act_ge_bytearray
    EXCEPT
    select bytearray_id_ as id from act_ru_variable
    EXCEPT
    select bytearray_id_ as id from act_hi_varinst
    EXCEPT
    select bytearray_id_ as id from act_hi_detail
), bba AS (
  SELECT * from act_ge_bytearray
  where id_ in (select id from ba)
)
select bba.name_, count(*), pg_size_pretty(sum(octet_length(bba.bytes_)))
from bba
where deployment_id_ is null
group by rollup (bba.name_) order by sum(octet_length(bba.bytes_)) desc;

Though I have a suggestion for @camunda team: there is a diagram https://docs.camunda.org/manual/7.8/user-guide/process-engine/img/erd_78_history.svg, it is practically useless, because the relations are not documented and you can’t search through the picture, it would be great at least to have the relations in diagram, even if they are not there in DDL.

nvanbelle · December 4, 2017, 1:22pm

That’s probably because the reference columns are actually not foreign keys but plain text fields.

Once I’ve tried to create a cleanup script for the Camunda database. It’s practically impossible because you’re most likely removing records that are still ‘referenced’ from other tables, resulting in corrupt data and an engine throwing exceptions wildly.

system · January 30, 2024, 11:02am