Job Executor: Pool Sizes: Reason for default values?

Hi Stephen,

Yes - in a prior post, I used the analogy of an old steam train. The boiler is the thread pool, the stoker the job acquisition thread. You need to balance the rate at which the stoker shovels coal with the capacity of the boiler. In the case of a cluster, it is as if you have two stokers working from a common pile of coal…

With job acquisition there is intrinsic DB overhead and, in the case of a remote DB, a network round trip. Hence, intuitively, fetching a batch should achieve better throughput than fetching a row at a time. In the case of a cluster there is another dimension: job acquisition contention. It may be tempting to reduce the fetch size to lower the chance of collisions; however, this requires more frequent round trips. With more frequent round trips, the chance of overlapping acquisition tends to increase, and hence the job executors tend to back off. Thus there is a sweet spot between frequency of acquisition and batch size…
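To make the contention/back-off interplay concrete, here is a minimal sketch of the kind of back-off policy described above: when a node fails to acquire its full batch (another node got part of it), the wait before the next acquisition cycle grows exponentially; a clean full-batch acquisition resets it. All class and parameter names here are illustrative assumptions, not any engine's actual API.

```java
// Sketch of a job-acquisition back-off strategy for a clustered job
// executor. Names (AcquisitionBackoff, nextWaitMillis, etc.) are
// hypothetical, chosen for illustration only.
public class AcquisitionBackoff {
    private final long baseIdleMillis;   // first wait after a collision
    private final long maxBackoffMillis; // cap for repeated collisions
    private long currentBackoffMillis;   // 0 means "no contention seen"

    public AcquisitionBackoff(long baseIdleMillis, long maxBackoffMillis) {
        this.baseIdleMillis = baseIdleMillis;
        this.maxBackoffMillis = maxBackoffMillis;
        this.currentBackoffMillis = 0;
    }

    /**
     * Called after each acquisition cycle.
     * jobsWanted   = configured batch size
     * jobsObtained = rows this node actually managed to lock
     */
    public long nextWaitMillis(int jobsWanted, int jobsObtained) {
        if (jobsObtained < jobsWanted) {
            // Another node grabbed part of the batch: back off, doubling
            // the wait each time, up to the configured maximum.
            currentBackoffMillis = Math.min(
                currentBackoffMillis == 0 ? baseIdleMillis
                                          : currentBackoffMillis * 2,
                maxBackoffMillis);
        } else {
            // Full batch acquired cleanly: no contention, no back-off.
            currentBackoffMillis = 0;
        }
        return currentBackoffMillis;
    }

    public static void main(String[] args) {
        AcquisitionBackoff b = new AcquisitionBackoff(50, 400);
        System.out.println(b.nextWaitMillis(3, 3)); // 0   (no contention)
        System.out.println(b.nextWaitMillis(3, 1)); // 50  (first collision)
        System.out.println(b.nextWaitMillis(3, 0)); // 100
        System.out.println(b.nextWaitMillis(3, 0)); // 200
        System.out.println(b.nextWaitMillis(3, 3)); // 0   (reset)
    }
}
```

The larger the batch, the more likely a partial acquisition in a cluster, which pushes the executors into back-off; the smaller the batch, the more round trips per job, which again raises the overlap probability. That is the sweet spot the defaults aim at.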

regards

Rob