Pattern Feedback: Timer-Cycle Checking of an External System's State/Status

Working on a few patterns for checking the state/status of an external system on an interval.

Consider the following:

I was thinking that the new Conditional Events in Camunda make this a simple process to model. This is only modelled, not tested at the moment.

A “parent” process sends a message to this process saying that an item has been shipped. The process then waits for the “State” variable to have the value “Item Received”.
Then there is an Event Sub-Process with a non-interrupting timer that executes every N minutes and checks the state of the item’s delivery by calling an external API.
When the item is delivered, the event sub-process sets the State variable to “Item Received”, which moves the process to the Message End Event, which sends a message back to the “creator process”.
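As a rough sketch of how that could look in BPMN 2.0 XML (element IDs, the delegate class name, and the five-minute cycle are all placeholder assumptions; namespaces and sequence flows are omitted for brevity):

```xml
<bpmn:process id="itemShippedProcess" isExecutable="true">
  <!-- Main flow waits here until the State variable becomes "Item Received" -->
  <bpmn:intermediateCatchEvent id="waitForItemReceived" name="Item Received">
    <bpmn:conditionalEventDefinition camunda:variableName="State">
      <bpmn:condition xsi:type="bpmn:tFormalExpression">
        ${State == 'Item Received'}
      </bpmn:condition>
    </bpmn:conditionalEventDefinition>
  </bpmn:intermediateCatchEvent>

  <!-- Non-interrupting timer event sub-process polls the legacy system -->
  <bpmn:subProcess id="pollSubProcess" triggeredByEvent="true">
    <bpmn:startEvent id="everyNMinutes" isInterrupting="false">
      <bpmn:timerEventDefinition>
        <bpmn:timeCycle>R/PT5M</bpmn:timeCycle>
      </bpmn:timerEventDefinition>
    </bpmn:startEvent>
    <bpmn:serviceTask id="checkDeliveryStatus" name="Check delivery API"
        camunda:class="com.example.CheckDeliveryDelegate" />
  </bpmn:subProcess>
</bpmn:process>
```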

Anyone have thoughts on this pattern?

A common usage I was working through is when you have shipped an item and want to notify the receiver about the delivery, but the system that tracks the delivery is ‘legacy’ and can only be polled.

Another factor I was considering is an overall timeout for the process: after a total of N attempts of the Event Sub-Process, the overall process would terminate so you do not have endless checks.



That seems like a nice way of approaching the problem.
To solve the “overall timeout” feature you could include a second event sub-process but make it interrupting.

@Niall, wouldn’t this create the possibility of a race condition if the BPMN modeller does not take into account the interaction between the timer cycle and the timeout timer?

I was thinking of something like a counter that, when the defined max count is reached, would end the process.
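As a plain-Java sketch of that counter idea (class and method names are invented for illustration; in the model, the counter would live in a process variable incremented by each run of the event sub-process):

```java
// Sketch of the max-attempts idea: each run of the polling event sub-process
// records an attempt; once the configured maximum is reached, the process
// should be terminated instead of polling again.
public class PollBudget {
    private final int maxChecks;
    private int checksSoFar = 0;

    public PollBudget(int maxChecks) {
        this.maxChecks = maxChecks;
    }

    /** Record one polling attempt; returns true while more attempts remain. */
    public boolean recordAttempt() {
        checksSoFar++;
        return checksSoFar < maxChecks;
    }

    /** True once the process should give up and end. */
    public boolean isExhausted() {
        return checksSoFar >= maxChecks;
    }
}
```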

Something like this (Where the timeout period would be Max Checks times Every N Minutes):



There wouldn’t be a problem at all with two timers, thanks to the way the engine executes them.
When a timer is activated it is given a date after which it can be executed, so if two timers have the same date in the future, both will be executed, although the order may be unpredictable. Either way, the interrupting timer event would still do the job.

Would there not be the possibility (in this very simple BPMN it would likely be next to impossible, but imagine we added activities that increased the time it takes to reach the cycle timer’s end event) that while the cycle timer is being executed, the timeout timer fires as an interrupting timer and essentially kills the possibly active event sub-process that is checking whether the item has been received?

(Again likely not a concern in this scenario)

This problem could easily be mitigated by always ensuring that there is a buffer between when your timeout executes and when the last timer cycle occurred.

Well, because each instance is single-threaded, once the non-interrupting timer is triggered the thread won’t pick up the interrupting timer until it reaches a wait state, so (provided it’s just running Java classes and scripts) it can’t interrupt the event sub-process.

You could always change the “state == timeout” event to a timer… that would make a lot of sense.

Right! Good points! Simpler.


Hi Stephen,

Interesting pattern. As a further abstraction, perhaps consider the following. Let’s assume that 90% of systems require polling to determine business-object state changes; however, a minority may be designed with hooks to enable notification.

Hence, would it be a better pattern to use an inbound message to carry a state-change notification, as this is the lowest common denominator… Thus, for those systems which do active notification, integrate an event mechanism. For those which don’t, implement a helper process which is timer based and, when it detects a state change, synthesizes and injects an event into the core process…
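A minimal sketch of the “helper process” half of that idea, with the event-injection step reduced to a callback (all names here are invented; in Camunda the callback would presumably correlate a message into the core process, e.g. via RuntimeService):

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper for systems that cannot push notifications: it polls a
// legacy state source and, when the observed state changes, synthesizes an
// event by invoking a callback (standing in for a BPMN message correlation).
public class StateChangePoller {
    private final Supplier<String> legacyStateSource; // wraps the legacy API call
    private final Consumer<String> eventInjector;     // injects the synthesized event
    private String lastSeenState;

    public StateChangePoller(String initialState,
                             Supplier<String> legacyStateSource,
                             Consumer<String> eventInjector) {
        this.lastSeenState = initialState;
        this.legacyStateSource = legacyStateSource;
        this.eventInjector = eventInjector;
    }

    /** One polling pass: injects an event only when the state has changed. */
    public void pollOnce() {
        String current = legacyStateSource.get();
        if (current != null && !current.equals(lastSeenState)) {
            lastSeenState = current;
            eventInjector.accept(current);
        }
    }
}
```

The core process then stays purely event-driven: it neither knows nor cares whether the inbound message came from a native hook or from this poller.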

Just a thought,



Hi Rob!

I was thinking through the same scenarios, and have been considering a few conditions:

  1. Where do you determine if polling is required? If polling is determined at the parent-process level, then I think the model above still works.

  2. If we want to determine whether polling is required at the level shown in the models above, we could look at something like this (just a first pass/idea):

In this example I used a single BPMN file for simplicity, but in practice it would probably be split across multiple files.

The top process would be called by some parent process that “shipped” (or did whatever action to) an item. The process would determine whether polling or a hook is required. If it is a hook, then it waits for a message from X system broadcasting that the item has been delivered.

If it is poll, then it sends a message to the second/bottom process to start the Poll process previously established.

I was thinking of this design for modularity purposes: there is no guarantee that the same Polling or Hook process would be used for every “Item”, so you could easily expand the process to support multiple systems or scenarios based on the business rule that determines what to do. And of course you could have the DMN just be data injected from the parent process.
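Stripped of the DMN machinery, the poll-vs-hook routing decision is just a lookup; a plain-Java sketch (system names and the default are invented for illustration) might be:

```java
import java.util.Map;

// Hypothetical routing rule: decide per target system whether to wait for a
// pushed notification (HOOK) or to start the polling process (POLL).
// In practice this would come from a DMN table or from parent-process data.
public class IntegrationRouter {
    public enum Mode { POLL, HOOK }

    private static final Map<String, Mode> RULES = Map.of(
            "legacy-shipping", Mode.POLL,
            "modern-carrier", Mode.HOOK);

    /** Unknown systems default to polling, the lowest common denominator. */
    public static Mode modeFor(String system) {
        return RULES.getOrDefault(system, Mode.POLL);
    }
}
```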



Thanks for sharing these models - discussing these sorts of patterns adds considerable value to the forum. Less technology and more process!

I hope you don’t mind me adding to this topic.

Couple of points -

Why not model the collaboration?
Since we’re looking at, in my opinion, process collaboration… why not model it that way? And, with regards to polling, I’ve run into a few issues with this pattern inside a process instance. This has more to do with implementing the concept of a polling service inside, or encapsulated within, a process model.

Process type, instances, and life-cycle
We need to take the current model and add (before refactoring with collaboration in mind) yet another perspective. This new perspective takes into account process-instance life-cycles.

When viewing the process model as a type and the in-flight process instance as an object representing an instantiated model definition… well, there’s some additional behavior. Empirically, we end up with active timers emitting events within zero-or-more (0…*) process instances (objects). And, in acknowledging the relationship between process-instances and their version type-definitions (the SDLC side of BPM), we correlate the timer-event to process instance per its type (versioned) definition. This is where things get interesting.

Given we have timers now living within the process instance (referring to ‘polling’), we have a fixed relationship between type (model) and instance. We can’t change the type, with regards to the polling model, without requiring a migration to the new definition! The workaround is to decouple event emitters (the focus here is the polling model) from the process itself. In other words, we don’t mix event generation into our process-type definition. Though we continue logically modeling the embedded timer, as we work towards the executable version we cut over to a more formal pattern for event management.

Referring back to the process life-cycle view, imagine trying to debug your process while a bunch of in-flight process timers are emitting and kicking off in-flight (process instance or object) behaviors… It’s madness. The solution is to migrate the timer execution requirements to something capable of direct management and independent life-cycle.

Cutting this narrative short - my workaround was to introduce an independent timer sub-system. The logical BPMN model remains, because that best expressed our intent. But, the timer implementation, as its own sub-system, provided both direct control (platform-specific configuration) and direct management (i.e. JMX).

@garysamuelson thanks for the details! I originally did not model it as a collaboration because in practice (at least in our models/world), the polling process would be called from many different parent processes. So an abstraction allows lots of freedom from our side.

Building on your point about the external timer.
If we look solely at the “Item has been shipped” process that has the Event Sub-Process: if the overall timeout timer is set as a date in a process variable rather than a cycle or a calculation of Now+N, and the Event Sub-Process is a fairly short time-to-live scenario, why do you see issues with migration?

If updates are needed to the specific polling task, a migration could occur that would update the parent, the timeout process variable would be migrated, and the event sub-processes that are still active could either be cancelled or left to end. Post-migration, the next execution of the event sub-process would use the properly updated/migrated configuration.
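Computing that timeout date as “Max Checks times Every N Minutes” and storing it as an ISO-8601 string (which a timer-date expression could then reference) might look like this sketch (class and parameter names are assumptions):

```java
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

public class TimeoutDate {
    /**
     * Overall timeout = maxChecks * pollInterval, anchored at a given start.
     * The result is formatted so it could be stored in a process variable and
     * referenced by a BPMN timer date expression.
     */
    public static String compute(OffsetDateTime start, Duration pollInterval, int maxChecks) {
        return start.plus(pollInterval.multipliedBy(maxChecks))
                    .format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }
}
```

Because the date is fixed once at instantiation, a later migration only has to carry the variable across unchanged rather than re-deriving a cycle.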

I can imagine that if you had a very long running or complex event sub-process, migrations may be more of a concern.

@garysamuelson can you provide a model that demonstrates the problem you describe?

Looking specifically at the “every N minutes” timer implementation (though this can apply to other technical implementation details regarding time management).

The scenario(s):

  • Require timer intervals per SDLC platform (i.e. dev, SIT, QA, Prod). In other words, I need some flexibility in my “start event timer” so that its interval is set per host, via a property file for example.
  • Require the ability to directly, and discretely, control the timer service. For example, I want to use a JMX console (i.e. HawtIO) to start, stop, pause, etc. My goal is direct management of the event service without interfering with the BPM engine itself. This scenario becomes serious if we experience a partial failure in the underlying SOA stack. Rather than backlogging (piling up events) in my BPM error handler, I want to simply pause the event emitter as it specifically applies to the in-flight process instances. I don’t want to shut down the BPM engine itself… just put a “hold” on the service affecting the BPM polling receivers (listeners that react to timer-initiated events).
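The per-host interval requirement reduces to reading a duration from configuration with a sensible fallback; a small sketch (the property name and default are assumptions):

```java
import java.time.Duration;
import java.util.Properties;

public class TimerConfig {
    /**
     * Resolve the polling interval from host-specific properties, e.g.
     * poll.interval=PT30S on a SIT box vs. PT10M in production.
     * Falls back to five minutes when the property is absent.
     */
    public static Duration pollInterval(Properties props) {
        String raw = props.getProperty("poll.interval");
        return raw == null ? Duration.ofMinutes(5) : Duration.parse(raw);
    }
}
```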

Want to reiterate that this is an implementation view. For example, I may want to mix in advanced JMS services to help with: load-balancing, wide-area distribution, wide-area business-transaction capabilities…

Hi Stephen,

What I was thinking of was a core event-driven process as a pattern. Then use a separate helper process if you need to poll; those systems which are intrinsically event driven may be able to initiate an event themselves. Thus, as per the simplified model shown below…




Wanted to follow up on this topic, since we’re taking advantage of Camunda’s ability to both model and execute collaborating flows, and to explain my reasoning behind abstracting “function” or “service” out of a logical process model while heading towards its executable version.

This may seem a little pedantic, though:

The reason for abstracting “system function” from an otherwise all-encompassing model is that we want to avoid a pattern whereby these mixed-in services (system functions: not process tasks) lead to problems in segmentation, agility, operations, and maintenance.

I’m pointing out that a process task is a measurable unit of work; per BPM terms, it must have scope: a beginning and an end whereby its instance, upon completion, delivers measured value into its parent process.

Here is a model containing an out-of-place task, one better suited as a system service or function:

Referring to this model’s system lane, the task “Waiting for Loan Request”: this is an unbounded system function that belongs in an ESB or routing service, because it offers no measurable value to the overall process model (i.e. it is not a task). For example, this “task” simply starts and remains running as a service. It doesn’t go away upon process completion. Consequently, we move “Waiting for Loan Request” to an event-processing service.

Here is the new home for “Waiting for Loan Request”:


Thanks for sharing, @garysamuelson. Interesting insights! I read it a few times to absorb it all.

And thanks as always, @Webcyberrob; you gave me some more ideas.

If workflows require a lot of polling, the best solution I came up with was creating a poll workflow: using Java reflection, the “Request Status” block can be passed a Java class implementing an interface that returns a boolean when it is finished; otherwise it will retry.
Because this poll flow is used via a call activity, I can set a timer boundary event to stop the infinite loop when it has been running for too long.
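A sketch of that reflection approach (the interface name, method, and example implementation are all invented for illustration): the call activity would pass a fully qualified class name as a process variable, and the “Request Status” step would instantiate it and ask whether the work is finished.

```java
public class PollStep {
    /** Hypothetical contract implemented by each pollable target system. */
    public interface StatusCheck {
        boolean isFinished();
    }

    /**
     * Instantiate the configured check via reflection and run it once.
     * The class name would arrive as a process variable from the call activity.
     */
    public static boolean requestStatus(String className) throws Exception {
        Object instance = Class.forName(className)
                .getDeclaredConstructor()
                .newInstance();
        return ((StatusCheck) instance).isFinished();
    }

    /** Example implementation, used only for illustration. */
    public static class AlwaysFinished implements StatusCheck {
        @Override
        public boolean isFinished() {
            return true;
        }
    }
}
```

This keeps the poll workflow generic: adding a new system to poll means writing one small StatusCheck class, not modelling a new process.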

Apologies for the late follow-up on the discussion.

My experience with generated events (timers, etc.) led me to use the services provided by Quartz (via Apache Camel) or by misc. application servers (WildFly).

I used Apache Camel because I needed a framework reasonably separated from the BPM/Camunda deployment. By the time SDLC/platform-specific requirements came into play, I realized I needed something that readily consumed startup configuration files.

For example, I had a system-integration (SIT) server requiring a much shorter interval (more timer events), while a QA server didn’t want any events, due to their need for a somewhat static set of scenarios with user-controlled (QA system) events.