Deserialize JSON from database (PostgreSQL)

Hi everyone,
I am storing JSON (org.camunda.bpm.engine.impl.util.json.JSONObject) to the database (Postgres) during my process instance runs. I am struggling to figure out how to deserialize these JSON values and how this works in general, and I’ve spent quite a while on this already. The JSON variables are listed in act_hi_varinst and act_hi_detail. Entries from both tables refer to act_ge_bytearray, but I can’t seem to cast the bytes back to a JSONObject. I am trying as follows below, but I only get empty JSON strings: {}.
Help appreciated. Thanks!

import org.camunda.bpm.engine.impl.util.json.JSONObject;

import java.io.IOException;
import java.io.InputStream;
import java.sql.*;

public class test {
    public static String substring(String str) {
        return str.substring(0, Math.min(str.length(), 5));
    }

    public static void main(String[] args) throws SQLException, IOException, ClassNotFoundException {
        Connection conn = DriverManager.getConnection("jdbc:postgresql://camunda-cockpit.company.com/process-engine?user=username&password=password");
        Statement stmt = conn.createStatement();
        ResultSet rs = stmt.executeQuery("select v.bytes_ from act_hi_detail ad join act_ge_bytearray v on ad.bytearray_id_ = v.id_ where ad.proc_def_id_='process-waterfall:93:b95f181e-1f6e-11e7-99c3-363063316461' and ad.name_ = 'jsonvariable';");

        while (rs.next()) {
            InputStream is = rs.getBinaryStream("bytes_");
            JSONObject jo = new JSONObject(is);
            System.out.println(jo.toString());
        }

        rs.close();
        stmt.close();
        conn.close();

    }
}
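
edit: one thing I noticed while debugging is that new JSONObject(is) never actually reads from the stream - it appears to resolve to the JSONObject(Object bean) constructor, which reflects over the InputStream's getters, so {} comes out regardless of what the bytes contain. Independently of that, reading the variable back through the engine API should let the engine's registered serializers do the deserialization. A sketch of that, assuming a configured default engine (the variable name matches my query above):

import org.camunda.bpm.engine.HistoryService;
import org.camunda.bpm.engine.ProcessEngine;
import org.camunda.bpm.engine.ProcessEngines;
import org.camunda.bpm.engine.history.HistoricVariableInstance;

public class ReadVariableViaApi {
    public static void main(String[] args) {
        // assumes a default engine is configured and reachable from this JVM
        ProcessEngine engine = ProcessEngines.getDefaultProcessEngine();
        HistoryService historyService = engine.getHistoryService();

        for (HistoricVariableInstance v : historyService
                .createHistoricVariableInstanceQuery()
                .variableName("jsonvariable")
                .list()) {
            // getValue() returns the deserialized object because the engine
            // applies its registered variable serializers here
            System.out.println(v.getName() + " = " + v.getValue());
        }
    }
}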

Though this might be considered off-topic, given that it is a question more about JSON object marshalling than about Camunda’s process engine.

Unless your question is with regard to “process variables” (as in using the Spin library)?

This sounds like you need to add to your project:

  • JAX-RS
  • JPA

The only pointer I have is somewhat dated. Though it demonstrates OSGi integration, you can skip past the OSGi details and instead build on a typical dynamic web-app archetype (with JPA).

see: OSGi and JPA Deliver Business Objects to BPM

Hi, I have been wrapping my head around a similar thing myself. The solution using Spin should work for all supported databases, but since I am also using PostgreSQL, I want the database itself to decide how objects are stored (it is an object-relational database, after all). JPA is not an option for me because it has major issues in clustered mode and adds overhead (CPU/memory/footprint/maintenance/troubleshooting/…). However, I still haven’t found any documentation on how to provide and override the engine’s de-/serialization strategy and classes.

Specifically, I am about to implement a custom de-/serializer for JSON data mapped to the PostgreSQL JSONB data type, which could also pick up Map and array/List/Set values, making them readable (and filterable) directly from the database.
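
To illustrate what I mean by filterable: with the JSON in a JSONB column, a plain SQL query can select process instances by a property inside the variable. A sketch in the style of the JDBC code above - var_json_ is a hypothetical column name my serializer would write to, and ->> is PostgreSQL's "get JSON field as text" operator:

import java.sql.*;

public class JsonbFilterSketch {
    public static void main(String[] args) throws SQLException {
        Connection conn = DriverManager.getConnection("jdbc:postgresql://localhost/process-engine?user=username&password=password");
        Statement stmt = conn.createStatement();
        // var_json_ is hypothetical; name_ and proc_inst_id_ are real columns of act_ru_variable
        ResultSet rs = stmt.executeQuery("select proc_inst_id_, var_json_ from act_ru_variable where name_ = 'jsonvariable' and var_json_ ->> 'status' = 'approved'");
        while (rs.next()) {
            System.out.println(rs.getString("proc_inst_id_") + ": " + rs.getString("var_json_"));
        }
        rs.close();
        stmt.close();
        conn.close();
    }
}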

Thinking even further, I am about to experiment with avoiding multiple variables per process instance in favor of a single JSON object that gets inserted/updated. It would be larger in size, but it would mean only one database call per transaction and ideally just one (or at least sequential) I/O page to read/write.
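
Roughly what I have in mind, sketched with the Spin API for readability (the variable and property names are made up, and my own serializer would eventually sit underneath instead of Spin's):

import org.camunda.bpm.engine.delegate.DelegateExecution;
import org.camunda.bpm.engine.delegate.JavaDelegate;
import org.camunda.spin.json.SpinJsonNode;

import static org.camunda.spin.Spin.JSON;

public class ConsolidatedStateDelegate implements JavaDelegate {
    @Override
    public void execute(DelegateExecution execution) {
        // instead of one setVariable() call (and one row) per value, keep a
        // single JSON document per instance: one row, one write per transaction
        SpinJsonNode state = JSON("{}");
        state.prop("customer", "Kermit");
        state.prop("amount", 42);
        execution.setVariable("state", state);
    }
}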

However, for that I don’t yet see an appropriate entry-point to hook into the engine.

Hey Ancorcon,
I changed the JSON handling in our processes to use Spin to get around the main problem of not being able to store/retrieve the data, but I like your idea of storing data directly as JSONB in Postgres. There are a few examples of custom serializers around on GitHub (example: https://github.com/ThorbenLindhauer/camunda-variable-serializers/blob/master/src/main/java/org/camunda/bpm/variables/processinstance/ProcessInstanceVariableSerializer.java). You’d need to register your serializer: https://github.com/camunda/camunda-bpm-platform/blob/master/engine-plugins/spin-plugin/src/main/java/org/camunda/spin/plugin/impl/SpinProcessEnginePlugin.java#L59-L64. A minimal registration plugin is sketched below.
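
For the registration part, a minimal engine plugin along the lines of the Spin plugin could look like this (a sketch - JsonbValueSerializer is a placeholder for your custom serializer class):

import org.camunda.bpm.engine.impl.cfg.AbstractProcessEnginePlugin;
import org.camunda.bpm.engine.impl.cfg.ProcessEngineConfigurationImpl;
import org.camunda.bpm.engine.impl.variable.serializer.VariableSerializers;

public class JsonbSerializerPlugin extends AbstractProcessEnginePlugin {
    @Override
    public void postInit(ProcessEngineConfigurationImpl configuration) {
        VariableSerializers serializers = configuration.getVariableSerializers();
        // index 0: consulted before the default serializers, so JSON values are
        // picked up by the custom serializer instead of the fallback object serializer
        serializers.addSerializer(new JsonbValueSerializer(), 0);
    }
}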

Cheers!
Ben.

Hi @Benjamin_Auffarth. Thanks for the pointers. However, even with Spin there is no infrastructure for customizing the way values are stored in and retrieved from the database, which is my main problem atm.

It looks like there is no way around extending the core classes for variable handling (around ValueFields) as well as the SQL templates for MyBatis directly in the engine. As I am using OSGi, I can provide patched versions at runtime, but it still feels wrong.

Since what you are planning to do is not a Camunda feature and is not a simple thing, there is no right way to do it :slight_smile:

@thorben I totally agree with that. As you are already here, can you recommend an entry-point for such an extension?

The use cases for dealing with JSON will become more common, and more (supported) databases will have native JSON data type support (e.g. PostgreSQL, MySQL, …), although major commercial players like Oracle, DB2, and SQL Server are pretty slow at evolving and only provide helper functions so far.

You would need a custom implementation of VariableSerializer. The harder part will be extending the MyBatis SQL mappings and the Java classes that map to the database rows. Not sure if there is a non-invasive way to do that (i.e. without patching the engine itself).

edit: To clarify my last comment: don’t hesitate to start hacking, and feel free to ask questions if you hit any roadblocks. It’s easier for us to answer concrete questions about a solution idea you came up with than to develop the solution idea for you.

OK, that sounds like a plan. I’ll go ahead and start hacking… :smiley:

I have done a very rough first implementation that stores JSON data directly in a JSONB column on PostgreSQL and in an unlimited TEXT (or CLOB) column otherwise. This way the JSON data remains readable at the database level.

However, the actual implementation is a mess. The following steps were necessary to make this happen:

  1. extend the ValueFields interface and all its implementations to recognize a new field (currently of type Object, as I haven’t decided what is best to transport between the layers; I’m using String for the time being)
  2. implement a new PrimitiveValue and PrimitiveValueTypeImpl - I did that directly in the engine, as I would otherwise have needed to touch the separately released variables project
  3. implement a custom serializer that extends PrimitiveValueSerializer<NewValue> and put it before any possibly conflicting serializers in the engine configuration (see the sketch after this list)
  4. adapt the table-creation SQL for each database type
  5. adapt the MyBatis mappings - here we need to split the PostgreSQL variable mappings (don’t forget the history) from the standard mappings, which also means we need to modify the DbSqlSessionFactory to recognize the custom mappings, or rely on the MyBatis magic that recognizes them via the databaseId attribute
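
For step 3, the serializer skeleton looks roughly like this. A sketch against the 7.x SPI - exact method signatures vary slightly between engine versions, and the dedicated JSONB-backed ValueFields accessor from step 1 is stood in for by the standard text field here:

import org.camunda.bpm.engine.impl.variable.serializer.PrimitiveValueSerializer;
import org.camunda.bpm.engine.impl.variable.serializer.ValueFields;
import org.camunda.bpm.engine.variable.Variables;
import org.camunda.bpm.engine.variable.impl.value.UntypedValueImpl;
import org.camunda.bpm.engine.variable.type.ValueType;
import org.camunda.bpm.engine.variable.value.StringValue;

public class JsonValueSerializer extends PrimitiveValueSerializer<StringValue> {

    public JsonValueSerializer() {
        // a real implementation would pass the new PrimitiveValueType from step 2
        super(ValueType.STRING);
    }

    @Override
    public String getName() {
        // distinct serializer name, so the adapted MyBatis mappings (step 5)
        // can route values of this serializer to the JSONB column
        return "json";
    }

    @Override
    public StringValue convertToTypedValue(UntypedValueImpl untypedValue) {
        return Variables.stringValue((String) untypedValue.getValue());
    }

    @Override
    public StringValue readValue(ValueFields valueFields) {
        // step 1 would add a dedicated getter for the JSONB-backed field
        return Variables.stringValue(valueFields.getTextValue());
    }

    @Override
    public void writeValue(StringValue value, ValueFields valueFields) {
        // step 1 would add a dedicated setter for the JSONB-backed field
        valueFields.setTextValue(value.getValue());
    }
}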

There are some details that I missed when relying on the compiler alone (e.g. the DefaultHistoryEventProducer) and that I only got pointed to after writing/extending some test cases.

As it turns out, this is possible, but it leaves me with very mixed feelings. Especially the separation of the variables code into its own project is questionable to me, as there is no backwards or forwards compatibility. With every extension one would have to keep both projects (the variables and the engine) in sync, which makes it hard to try something without opening Pandora’s box. :wink:

But I’ll keep working on it a little to see if I can make it at least somewhat more elegant and detached from the actual engine code (which means implementing extension points in the engine so that I can do what I want externally).
