I totally agree with you on the semantic versioning part. Nevertheless, this could be an option for a version 8.0, no?
I also recently tried to delete individual (obsolete) process definitions, including their historic data, using the Cockpit UI and/or the REST API, only to run into timeouts at the JDBC level (which I configured because I don't want any thread to get stuck for more than 5 minutes).
Then I wrote myself a simple SQL history cleanup script, executed by a Jenkins job with timestamp logging enabled. An excerpt from a first execution (carried out just before the upgrade to 7.7):
```
2017-06-21T01:54:42+0200 CREATE MATERIALIZED VIEW v_cleanup_procinst AS (
2017-06-21T01:54:42+0200     SELECT
2017-06-21T01:54:42+0200         id_
2017-06-21T01:54:42+0200     FROM
2017-06-21T01:54:42+0200         act_hi_procinst
2017-06-21T01:54:42+0200     WHERE
2017-06-21T01:54:42+0200         end_time_ < (CURRENT_TIMESTAMP - '6 months'::interval)
2017-06-21T01:54:42+0200 );
2017-06-21T01:54:47+0200 SELECT 634546
...
2017-06-21T02:33:55+0200 DELETE FROM
2017-06-21T02:33:55+0200     act_hi_actinst
2017-06-21T02:33:55+0200 WHERE
2017-06-21T02:33:55+0200     proc_inst_id_ IN (
2017-06-21T02:33:55+0200         SELECT
2017-06-21T02:33:55+0200             id_
2017-06-21T02:33:55+0200         FROM
2017-06-21T02:33:55+0200             v_cleanup_procinst
2017-06-21T02:33:55+0200     )
2017-06-21T02:33:55+0200 ;
2017-06-21T03:49:50+0200 DELETE 9040134
```
Deleting ~9M rows took more than 1h 15m!!!
Of course, there are ways to optimize that in PostgreSQL (and the system is a pretty slow one in terms of I/O), but it illustrates the underlying problem quite well.
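Just to sketch the kind of optimization I mean (illustrative only: it assumes the v_cleanup_procinst view from above still exists, the temporary index name and the 10k batch size are made up, and whether the stock DDL already covers proc_inst_id_ is a separate question, see below):

```sql
-- Sketch: a (temporary) index on the referencing column, in case the
-- stock DDL doesn't provide one, plus join-based deletes in slices so
-- each transaction stays short.
CREATE INDEX tmp_idx_actinst_procinst
    ON act_hi_actinst (proc_inst_id_);

-- Repeat until it reports DELETE 0; each round removes at most 10000 rows.
DELETE FROM act_hi_actinst
WHERE ctid IN (
    SELECT a.ctid
    FROM act_hi_actinst a
    JOIN v_cleanup_procinst v ON v.id_ = a.proc_inst_id_
    LIMIT 10000
);

DROP INDEX tmp_idx_actinst_procinst;
```

The ctid-based slicing keeps every delete bounded, which is a lot friendlier to the WAL and to JDBC-level timeouts than one giant 9M-row transaction.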
The new history TTL comes to the rescue here. Really nice to see that it also accounts for basic business-critical aspects like the batch window and batch size.
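For anyone who wants to check where the TTL ends up: as far as I can tell, the 7.7 schema stores it per definition in ACT_RE_PROCDEF (HISTORY_TTL_, in days), so something like this should show which deployments are covered:

```sql
-- Which deployed definitions carry a TTL (7.7+ schema)?
-- HISTORY_TTL_ is in days; NULL means the cleanup job skips them.
SELECT key_, version_, history_ttl_
FROM act_re_procdef
ORDER BY key_, version_;
```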
Nevertheless, I somehow get the impression that the DDL has a lot of room for improvement (at least for PostgreSQL).
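One illustrative data point for that impression: listing which act_hi_* tables even have an index covering proc_inst_id_, i.e. the column every one of those history deletes filters on:

```sql
-- History tables that don't show up here get sequential scans
-- for the deletes above.
SELECT tablename, indexname
FROM pg_indexes
WHERE tablename LIKE 'act_hi_%'
  AND indexdef ILIKE '%proc_inst_id_%'
ORDER BY tablename;
```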
Cheers,
ancoron