Hi Galen,
Regarding the re-running of tasks, I tend to use the following principles:
Make ‘transactions’ larger, i.e. aim for fewer check points (async continuations).
Use check points at natural boundaries. (These are also great suspend points.)
If a service is not idempotent, then isolate it into a transaction of its own (e.g. async before & after).
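To make the effect of these principles concrete, here is a toy Python model. It is only a sketch, not engine code: the function `run` and the step names `charge_card` and `notify` are made up for illustration. It assumes each inner list is one transaction, a check point is committed after each transaction, and a failure re-runs everything since the last check point.

```python
# Toy model only (hypothetical names): a process is a list of transactions,
# each transaction is a list of steps, and a check point is committed after
# every transaction completes.

def run(process, fail_once):
    """Run all transactions, retrying from the last check point on failure.

    Returns how many times each step's side effect was executed.
    """
    counts = {}
    checkpoint = 0  # index of the first transaction not yet committed
    while checkpoint < len(process):
        try:
            for step in process[checkpoint]:
                if step in fail_once:
                    fail_once.discard(step)  # fail on the first attempt only
                    raise RuntimeError(f"{step} failed")
                counts[step] = counts.get(step, 0) + 1  # side effect happens
            checkpoint += 1  # commit: later failures never rewind past here
        except RuntimeError:
            pass  # roll back to the check point and re-run this transaction
    return counts


# Non-idempotent step shares a transaction with a step that fails once.
together = run([["charge_card", "notify"]], fail_once={"notify"})

# Non-idempotent step isolated in its own transaction (async before & after).
isolated = run([["charge_card"], ["notify"]], fail_once={"notify"})

print(together)  # {'charge_card': 2, 'notify': 1}
print(isolated)  # {'charge_card': 1, 'notify': 1}
```

In the first run the card is charged twice because the failure in `notify` rewinds past it. In the second run the check point after `charge_card` stops the retry from reaching it.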
In a distributed system you will always need to deal with failure. Thus, even without the suspend behaviour, you may still need to deal with tasks being re-run. So my strategy is to minimise both the likelihood and the consequence…
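One way to reduce the consequence of a re-run is to guard a non-idempotent call with a key, so a repeat call with the same key reuses the first result. The Python sketch below is an assumption of mine rather than anything engine-specific: `make_idempotent`, `send_invoice` and the key format are hypothetical, and the results are kept in memory, whereas a real system would need durable storage.

```python
def make_idempotent(action):
    """Wrap `action` so repeat calls with the same key reuse the first result."""
    results = {}  # in-memory only; assumption: real systems persist this

    def guarded(key, *args):
        if key not in results:
            results[key] = action(*args)  # the side effect happens once per key
        return results[key]

    return guarded


sent = []  # stands in for the external side effect


def send_invoice(order_id):
    """Hypothetical non-idempotent service: every call sends another invoice."""
    sent.append(order_id)
    return f"invoice-{len(sent)}"


send_once = make_idempotent(send_invoice)
first = send_once("order-42", "order-42")
again = send_once("order-42", "order-42")  # simulated re-run of the same task

print(first, again, sent)  # invoice-1 invoice-1 ['order-42']
```

This only narrows the window: a crash after the service call but before the result is recorded can still cause a duplicate, so it complements the check point placement rather than replacing it.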
Regards,
Rob