Names: Phillip Allen allenpd@clarkson.edu
Matthew Finlayson finlayms@clarkson.edu

School: Clarkson University, Potsdam, NY

Category of Project: 'Other Challenge' Development Library

Threadpool: A reusable pool of general-purpose threads
Full text: http://www.clarkson.edu/~allenpd/threadpool/paper.html
Part 1:

The current trend in computing technology is towards creating robust servers for client-server applications, such as scientific computing or the World Wide Web. Many server applications are multithreaded; unfortunately, thread programming is difficult [3]. One way to reduce the complexity of these programs is to use threads as a work crew [2]. Since servers typically handle a lot of traffic, a server could potentially spawn a large number of threads on the fly. Spawning a thread per request incurs a large overhead when the operations are short-lived. A better solution is to create a pool of threads beforehand. Each time a new client task enters the system, an idle thread wakes up and performs the task. This strategy is particularly effective because it amortizes the thread creation/destruction overhead over many client tasks and, once the server has initialized, reduces the cost of starting a client task to a mere procedure call. This method of thread organization is called a threadpool. The goal of our project is to create a generic threadpool library for POSIX threads (pthreads).

We built the threadpool library to satisfy the following requirements:

1) It must execute all dispatched functions exactly once.
2) When dispatch is called, the work to be done is placed into a queue and dispatch returns. Dispatch blocks only when the queue already uses at least 64 kB of heap space and is full; in that case, the call blocks while adding to the queue. This keeps the program from crashing for lack of heap space.
3) There can be no deadlock conditions in the threadpool. Deadlock is always a possibility in threaded programs, but the threadpool package itself should never be its cause.
4) There should be no busy-wait loops in the threadpool. Busy-wait loops tend to be slow. In some cases a two-phase wait is faster than blocking [1]; however, programming around busy-wait loops is just as effective.
5) A simple, clean API. To view the API please visit: http://www.clarkson.edu/~allenpd/threadpool/API.html

The only additional requirement is that the programmer must not dispatch the threadpool's own functions through the threadpool. Any such call returns immediately without executing any code.
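
To make requirements 1 through 4 concrete, the following is a minimal sketch of a pool with a bounded queue, written against standard pthreads calls. It is not the library's code or API: the names pool_t, pool_create, pool_dispatch, pool_destroy and run_demo are hypothetical, the queue bound is a tiny slot count rather than the 64 kB heap limit described above, and it uses two condition variables where the library uses three.

```c
/* Sketch only: every name here is hypothetical, not the library's real API. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define QUEUE_CAP 4                 /* tiny bound, chosen for illustration */

typedef void (*work_fn)(void *);

typedef struct {
    work_fn fn[QUEUE_CAP];          /* circular buffer of pending work */
    void *arg[QUEUE_CAP];
    int head, count;                /* oldest slot; number of queued items */
    int shutdown;                   /* set once by pool_destroy */
    pthread_mutex_t lock;
    pthread_cond_t not_empty;       /* idle workers sleep here */
    pthread_cond_t not_full;        /* blocked dispatchers sleep here */
    pthread_t *threads;
    int nthreads;
} pool_t;

static void *worker(void *p) {
    pool_t *pool = p;
    for (;;) {
        pthread_mutex_lock(&pool->lock);
        while (pool->count == 0 && !pool->shutdown)   /* sleep, never spin */
            pthread_cond_wait(&pool->not_empty, &pool->lock);
        if (pool->count == 0) {     /* shutting down and queue drained */
            pthread_mutex_unlock(&pool->lock);
            return NULL;
        }
        work_fn fn = pool->fn[pool->head];
        void *arg = pool->arg[pool->head];
        pool->head = (pool->head + 1) % QUEUE_CAP;
        pool->count--;              /* dequeued under the lock: runs exactly once */
        pthread_cond_signal(&pool->not_full);
        pthread_mutex_unlock(&pool->lock);
        fn(arg);                    /* run the task outside the lock */
    }
}

/* Returns NULL on bad input or allocation failure. If pthread_create fails
   part-way, the pool simply runs with fewer workers (not handled further). */
pool_t *pool_create(int nthreads) {
    if (nthreads <= 0) return NULL;
    pool_t *pool = calloc(1, sizeof *pool);
    if (!pool) return NULL;
    pool->threads = malloc(sizeof *pool->threads * (size_t)nthreads);
    if (!pool->threads) { free(pool); return NULL; }
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->not_empty, NULL);
    pthread_cond_init(&pool->not_full, NULL);
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&pool->threads[pool->nthreads], NULL, worker, pool) != 0)
            break;
        pool->nthreads++;
    }
    return pool;
}

/* Queues fn(arg) and returns; blocks only while the queue is full. */
void pool_dispatch(pool_t *pool, work_fn fn, void *arg) {
    pthread_mutex_lock(&pool->lock);
    while (pool->count == QUEUE_CAP)
        pthread_cond_wait(&pool->not_full, &pool->lock);
    int tail = (pool->head + pool->count) % QUEUE_CAP;
    pool->fn[tail] = fn;
    pool->arg[tail] = arg;
    pool->count++;
    pthread_cond_signal(&pool->not_empty);
    pthread_mutex_unlock(&pool->lock);
}

/* Lets the workers drain the queue, then joins them and frees the pool. */
void pool_destroy(pool_t *pool) {
    pthread_mutex_lock(&pool->lock);
    pool->shutdown = 1;
    pthread_cond_broadcast(&pool->not_empty);
    pthread_mutex_unlock(&pool->lock);
    for (int i = 0; i < pool->nthreads; i++)
        pthread_join(pool->threads[i], NULL);
    pthread_mutex_destroy(&pool->lock);
    pthread_cond_destroy(&pool->not_empty);
    pthread_cond_destroy(&pool->not_full);
    free(pool->threads);
    free(pool);
}

/* Usage example: dispatch ntasks counting tasks and report how many ran. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

static void count_task(void *arg) {
    pthread_mutex_lock(&demo_lock);
    ++*(int *)arg;
    pthread_mutex_unlock(&demo_lock);
}

int run_demo(int ntasks, int nthreads) {
    int done = 0;
    pool_t *pool = pool_create(nthreads);
    assert(pool != NULL);
    for (int i = 0; i < ntasks; i++)
        pool_dispatch(pool, count_task, &done);
    pool_destroy(pool);             /* returns after every queued task has run */
    return done;
}
```

In this sketch both waits sit in a while loop around pthread_cond_wait, so a thread that has nothing to do sleeps in the kernel instead of spinning, and tasks run outside the lock so a slow task cannot stall dispatch.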

Part 2:

The threadpool is written in C and uses pthreads (POSIX threads) to handle threading; it ports without change to many flavors of UNIX. There are essentially two parts. The first is a queue that is dynamic in size. Instead of destroying a node after use with free, it stores unused nodes in a separate queue, which removes the overhead of repeated malloc and free calls. The second part is the actual threadpool functions; with the use of a mutex and three condition variables, there are no deadlock or busy-wait conditions and the code behaves as we designed it to. The total amount of code, without comments or debugging statements, is only 400 lines, half of which is the customized queue. Adding a work crew to a single-threaded server takes only three lines, which should greatly ease the task of application developers.
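
The node-recycling idea can be illustrated with a short sketch. This is an assumption-laden illustration rather than the library's queue: the names rqueue_t, rq_push, rq_pop and rq_destroy are hypothetical, the spare nodes are kept on a simple linked stack instead of a second queue, and locking is left to the caller (in a threadpool, the pool's mutex would be held around each call).

```c
/* Sketch only: hypothetical names, not the library's actual queue. */
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    void *item;
    struct node *next;
} node_t;

typedef struct {
    node_t *head, *tail;   /* live FIFO of queued items */
    node_t *spare;         /* recycled nodes waiting for reuse */
    long allocations;      /* number of malloc calls, kept for illustration */
} rqueue_t;

void rq_init(rqueue_t *q) {
    q->head = q->tail = q->spare = NULL;
    q->allocations = 0;
}

/* Appends item; returns 0 on success, -1 if a new node could not be allocated. */
int rq_push(rqueue_t *q, void *item) {
    node_t *n = q->spare;
    if (n) {
        q->spare = n->next;            /* reuse a parked node: no malloc */
    } else {
        n = malloc(sizeof *n);
        if (!n) return -1;
        q->allocations++;
    }
    n->item = item;
    n->next = NULL;
    if (q->tail) q->tail->next = n; else q->head = n;
    q->tail = n;
    return 0;
}

/* Removes and returns the oldest item, or NULL if the queue is empty. */
void *rq_pop(rqueue_t *q) {
    node_t *n = q->head;
    if (!n) return NULL;
    q->head = n->next;
    if (!q->head) q->tail = NULL;
    void *item = n->item;
    n->next = q->spare;                /* park the node instead of calling free */
    q->spare = n;
    return item;
}

/* Releases every node, both live and spare. Items themselves are not freed. */
void rq_destroy(rqueue_t *q) {
    node_t *lists[2] = { q->head, q->spare };
    for (int i = 0; i < 2; i++) {
        for (node_t *n = lists[i]; n != NULL; ) {
            node_t *next = n->next;
            free(n);
            n = next;
        }
    }
    rq_init(q);
}
```

With this layout the number of malloc calls is bounded by the largest queue length ever reached, not by the number of items that pass through the queue. The cost is that memory is held until rq_destroy is called.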

Part 3:

The threadpool was tested using a very basic server application; it accepted TCP/IP requests, did an amount of work set at server creation, and then responded to the request. The clients were simple applications that made a request to the server and then waited for a response. If the server dropped a request, the client would close its connection and start a new request. The network connecting the test machines is a switched Ethernet running at a minimum of 100 Mbps; tests were conducted at night in order to minimize the effect of other traffic. The time to send one packet across this connection was 2.230 ms.

Our tests ranged from I/O-intensive to computation-intensive applications; in all cases, the server with threads added to the threadpool outperformed the monolithic implementation. For a complete description of the machines used for testing and the results of our tests, please visit: http://www.clarkson.edu/~allenpd/threadpool/testing.html

Our threadpool provides a simple interface for an application writer to create multithreaded applications. People tend to implement their own threadpools when they need them. It is hard to get a threadpool right without extensive testing and proofs of correctness. Therefore, we believe a solid, fully tested version of a threadpool is a valuable asset to the open source community.

References:
1. Arpaci-Dusseau, Andrea C. Scheduling with Implicit Information in Distributed Systems. ACM Transactions on Computer Systems, Vol. 19, No. 3, 2001, pp. 283-331.
2. Birrell, Andrew D. An Introduction to Programming with Threads. Digital SRC Research Report 35.
3. Ousterhout, John. Why Threads Are A Bad Idea (for most purposes). Invited Talk at the 1996 USENIX Technical Conference (January 25, 1996).