Hi all,
I’ve noticed intermittent token stalling (without an incident or error) on an event/task that includes a start listener with JavaScript code. I tested the listener code attached to a start event and to a script task, with the same result. Sometimes the process executes this step successfully, and sometimes the token just stops there. When it stalls, I can see that the JavaScript did not execute properly, as the process instance variables are not updated (the variables referenced by the script are initially set with default values when the instance is started via the API). Rebooting the Camunda instance doesn’t move the token forward. The problem seems to go away for a little while if I make a change to the process, redeploy it to Camunda, and start a new instance, but eventually it comes back in instances that are started later.
Does anyone have any ideas about what might be causing this?
Update: I put a 3-second timer delay after the start event and now the token seems to not be stalling. I figure maybe the instance needs some time to start up before variables can be set, but that’s just a wild guess. Anyways, I hope this helps anyone who runs into this issue.
Does anyone have a better idea? I’d love to avoid adding an unnecessary delay to the process.
It’s setting the value of 3 instance variables. For the bpm_action variable, it sets it to the string “01”. For the bpm_process_instance variable, it gets the process instance Id of the instance and sets it to that value. For the bpm_process_execution variable, it gets the instance Id and sets it to that value. These variables are needed later in the process for an httpconnector event.
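For anyone following along, here is a minimal sketch of a listener script along the lines described above. It is not the exact script from the process; the function name `initInstanceVariables` is made up for illustration, and `execution` is assumed to be the object the engine exposes to listener scripts. The execution is passed in as a parameter so the logic can also be tried outside the engine with a stand-in object.

```javascript
// Sketch only: sets the three variables described in the post above.
// `initInstanceVariables` is a hypothetical name, not part of Camunda.
function initInstanceVariables(execution) {
  // Fixed action code used later in the process
  execution.setVariable("bpm_action", "01");

  // Both ID variables take the process instance ID, as described above
  var instanceId = execution.getProcessInstanceId();
  execution.setVariable("bpm_process_instance", instanceId);
  execution.setVariable("bpm_process_execution", instanceId);
}

// Inside a listener script the engine supplies `execution`;
// outside the engine this guard simply skips the call.
if (typeof execution !== "undefined") {
  initInstanceVariables(execution);
}
```

If the token stalls, checking whether these three variables still hold their default values is a quick way to confirm that the script never ran.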
It’s already set up like that (excluding the start event). I will add async before to the start event and test it. It will take a day or two to know if it works, since each time I update the BPMN file and deploy it, the problem goes away and then returns a day or two later. I’ll report back.
Ok, so I’ve tested adding ‘async before’ to each event/task, including the start event, and it’s stalling on the start event. It didn’t stall for a few days after deploying the process, and then it started stalling (which seems to be the pattern).
Here’s a screenshot. The 13 instances on the placeholder task are instances I started during the first couple days after deploying the process, and the 5 instances on the start event are from today. From what I’ve seen, once it starts stalling, it stays that way. This occurs on camunda v7.8 and v7.9.
Are there any other process defs running on your server?
The type of symptom you are describing is similar to what I have faced when the job executor becomes stuck/locked by tasks that never complete and never time out. In my case it was a bug (a missing timeout feature) in HTTP-Connector: if a network connection stalled and never timed out, the job/task would stay in the job executor forever.
Can you provide your actual BPMN file? The file you provided is stripped of the actual content/scripts that are being executed.
Yes, there is another process def running. It is the full version of the process def I’m working on here with you. That process def does have an http-connector and a mail-connector.
I PM’d you a link to the xml doc for the full process def (didn’t want to post here, in case I wasn’t thorough enough in removing internal data).
Note that during my testing, the token didn’t pass into the subprocess labeled ‘PS03’, or the end event.
and retest. Make sure to add the timeout method to your connections. What we are testing is whether you have network connections that stay open forever, so that the job executing your http-connector remains an active job and uses up the worker pool.
I would also recommend that you move your JavaScript into external files, so IDEs can inspect your JS rather than it sitting inside the modeler BPMN code. That makes it easier to find issues when you have this many scripts.
Another item for you: generally you should not send signals or messages to yourself, that is, do not send a signal/message from ProcessA to ProcessA. In your case you are doing this with your user task End script. There are weird effects and it does not work as one would think. I have found that if you think you need to message “yourself”, you usually have a BPMN design issue. Consider abstracting into a different process and using a message event or Call Activity.
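To illustrate the timeout point, here is a rough sketch assuming the HTTP call is made with Jsoup (the library mentioned at the end of this thread) from a script running in the engine. The helper name `fetchBodyWithTimeout`, the URL, and the 10-second value are placeholders, not anything from the actual process. `Jsoup` is passed in as a parameter so the function can be tried outside the engine with a stand-in object.

```javascript
// Sketch only: a request that gives up instead of hanging forever.
// `fetchBodyWithTimeout` is a hypothetical helper name.
function fetchBodyWithTimeout(Jsoup, url, timeoutMs) {
  var response = Jsoup.connect(url)
    .timeout(timeoutMs)       // request timeout in milliseconds
    .ignoreContentType(true)  // accept JSON and other non-HTML responses
    .execute();
  return response.body();
}

// Inside a Camunda script (Nashorn), assuming jsoup is on the classpath:
//   var Jsoup = Java.type("org.jsoup.Jsoup");
//   var body = fetchBodyWithTimeout(Jsoup, "https://example.com/api", 10000);
```

With a timeout in place, a stalled connection should surface as an exception (and so a failed job) rather than a job that occupies a worker thread indefinitely.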
Thanks for all the feedback! I will test those changes.
A couple of questions:
-The reason I was using signals is that I had a task that was used repeatedly, so I was attempting to re-use the same code across multiple End Listeners. Is abstracting into a different process (as you mentioned) the best approach, or is there a better way that keeps everything inside one file?
-If I move the JavaScript into external files, how do I deploy that easily? I’m currently deploying via a curl API request, and it deploys the BPMN file only (as far as I can see).
Use Postman (getpostman.com) and deploy your other JS files along with the BPMN file. They are just additional files in your list of files.
It depends on what you want to do. Looking at your example, it’s just a script you are executing, so there does not look to be a lot of value in having it as a BPMN process or even a task. It’s the equivalent of calling a method in each of your other scripts. So I would likely add your JS to the classpath, then load it and call it from each of your other scripts that need to execute it.
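If you would rather keep the deployment scripted, the same idea can be sketched in code: the REST API’s `POST /deployment/create` endpoint takes a multipart form, and each file is just one more part. The sketch below assumes a runtime with global `FormData`, `Blob`, and `fetch` (for example Node 18+); the helper name `buildDeploymentForm`, the file names, and the base URL are placeholders.

```javascript
// Sketch only: builds the multipart form for a deployment containing
// a BPMN file plus any number of script files.
// `buildDeploymentForm` is a hypothetical helper name.
function buildDeploymentForm(deploymentName, files) {
  var form = new FormData();
  form.append("deployment-name", deploymentName);
  files.forEach(function (file) {
    // Each resource is one form part, named after the file
    form.append(file.name, new Blob([file.content]), file.name);
  });
  return form;
}

// Usage (not executed here; the URL assumes a local default install):
//   var form = buildDeploymentForm("my-process", [
//     { name: "process.bpmn", content: bpmnXml },
//     { name: "helpers.js", content: helperSource }
//   ]);
//   fetch("http://localhost:8080/engine-rest/deployment/create",
//         { method: "POST", body: form });
```

With curl, the equivalent is to add one more `-F` file field per script to the existing request.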
Thanks @StephenOTT. I’m working on implementing these ideas; they are very helpful. Should I replace the email-connector as well, or is this issue potentially limited to the http-connector?
My challenge with sending via HTTP is that I’m including process variables in the email body, so I’m not sure how to set up a re-usable script that processes all variable values to create the final email body output (assuming I’m placing the email body content in an input parameter and extracting its contents via JavaScript). It seems like I’d have to create a new script for each email, whereas with the email-connector I can just place the variables directly into the input field. Is there a better way?
So you’re saying the issue likely doesn’t affect the email connector? They’re both ‘connectors’, so if the code is shared by all connectors, then maybe it has the same problem?
Ok, so I’ve followed your recommendations and tested for a few days now (the test process uses Jsoup for HTTP requests and the mail connector for email sending). The problem has not returned, so it seems like the issue has been resolved. Thank you so much! Also, thanks for all of your additional tips; those were extremely helpful, and things are running much more efficiently now.
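One possible way to avoid a separate script per email is a single generic function that fills `${name}` placeholders from a lookup. This is only a sketch of that idea, not something suggested earlier in the thread; the name `renderTemplate`, the placeholder syntax, and the `emailBodyTemplate` variable are all assumptions.

```javascript
// Sketch only: replaces ${name} placeholders using a lookup function.
// `renderTemplate` is a hypothetical helper name.
function renderTemplate(template, lookup) {
  return template.replace(/\$\{(\w+)\}/g, function (match, name) {
    var value = lookup(name);
    // Leave the placeholder untouched when no value is found
    return value == null ? match : String(value);
  });
}

// Inside a Camunda script, the lookup could read process variables:
//   var body = renderTemplate(emailBodyTemplate, function (name) {
//     return execution.getVariable(name);
//   });

// Outside the engine, a plain object works as the variable source:
var sample = { customer: "Ann", bpm_action: "01" };
var text = renderTemplate("Hi ${customer}, action ${bpm_action}, ${missing}",
  function (name) { return sample[name]; });
// text is "Hi Ann, action 01, ${missing}"
```

The email text then stays in an input parameter as before, and the same script can be re-used for every email in the process.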