I am working on an application which spawns a new thread per request. Sometimes the number of threads active on the machine at one time is in the high hundreds. It’s suspected that this is causing all sorts of problems. I would like to experiment with using a thread pool instead to see if this improves throughput. But first, I have to convince the powers that be to allow me the time.
As part of my argument, I think a good analogy that compares the brute-force “launch x threads” method to the technique of thread pools would help. Is there a text-book analogy for this?
Not quite a text-book analogy:
Thread pooling for an application is like recycling for a community.
Every time we’re thirsty we can go ahead and buy a new bottle of water made of all-new materials. It’s easy, relatively cheap for us, and quite profitable for the manufacturer. Unfortunately, it doesn’t scale.
In a similar way, creating a new thread from the ground up each time is wasteful (and potentially slow), and it has no built-in safeguard. Too many threads could exhaust the resources of the execution environment, just like too many plastic bottles could overwhelm and poison a natural habitat.
Threads in thread pools are often called “worker threads” for good reason. They are like workers in a company, ready to take on “tasks”, each occupying some work space, each with its own queue of tasks to process.
Launching a thread just to execute a task is like finding the resources to hire someone specifically for that task, having her do it, then firing her.
Obviously, if the task is very long, rare, and specific, there might be no trouble with this, but the more often it occurs, the more expensive it gets.
In this case, just putting the task at the top of a worker’s task queue can be enough. If she takes too long to start working on it, another worker with no more tasks of her own can steal the task from her queue. If no worker is available for some time, it means either you don’t have enough workers, or the work cannot be done quickly with the available resources. In particular, if you already have the maximum number of workers you can use without everyone stepping on everyone else’s toes, as in a thread pool, then you can do nothing about it other than get a bigger place (better hardware) or reduce the workload.
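To make the “fixed crew of workers with a task queue” idea concrete, here is a minimal sketch using Python’s standard-library `ThreadPoolExecutor` (the `handle_request` function and the worker count of 4 are placeholders, not anything from the question):

```python
from concurrent.futures import ThreadPoolExecutor

def handle_request(request_id):
    # Placeholder for real per-request work.
    return request_id * 2

# A fixed crew of 4 workers; submitted tasks queue up and are
# processed by reused threads instead of spawning a thread each time.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(handle_request, i) for i in range(10)]
    results = [f.result() for f in futures]

print(results)  # → [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

If all workers are busy, new submissions simply wait in the queue, which is exactly the safeguard the per-request-thread design lacks.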
This analogy can also easily explain the cost of interruption and communication.
Imagine eating an entire package of Oreo cookies. The experience would be so much more enjoyable if you didn’t have to eat the dry chocolate wafers every time you want a delicious creamy center, right? Using thread pools is like stacking the centers of all those cookies between a single pair of wafers!
That analogy will be much more convincing if you profile your code and show how much time your machines spend starting and stopping threads (eating the wafers) vs. time spent on creamy centers (actually handling requests).
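As a starting point for that profiling, here is a rough micro-benchmark sketch (Python stdlib only; the trivial task body and the counts are arbitrary placeholders) that times spawn-a-thread-per-task against a small reused pool:

```python
import time
import threading
from concurrent.futures import ThreadPoolExecutor

def tiny_task():
    pass  # the "creamy center" is deliberately tiny here

N = 2000

# Wafer-heavy approach: one fresh thread per task.
start = time.perf_counter()
for _ in range(N):
    t = threading.Thread(target=tiny_task)
    t.start()
    t.join()
per_thread = time.perf_counter() - start

# Pooled approach: reuse a small fixed set of worker threads.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    for f in [pool.submit(tiny_task) for _ in range(N)]:
        f.result()
pooled = time.perf_counter() - start

print(f"spawn-per-task: {per_thread:.3f}s, pool: {pooled:.3f}s")
```

With a near-empty task body, almost all of the first timing is thread start/stop overhead; substituting your real request handler shows how much of your machines’ time is wafers versus creamy centers.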
Take a look at libdispatch if you need an implementation.