Names: Phillip Allen allenpd@clarkson.edu
Matthew Finlayson finlayms@clarkson.edu
School: Clarkson University, Potsdam, NY
Category of Project: 'Other Challenge' Development Library
Threadpool: A reusable pool of general-purpose threads
Full text: http://www.clarkson.edu/~allenpd/threadpool/paper.html
Part 1:
The current trend in computing technology is towards creating robust servers for client-server applications, such as scientific computing or the World Wide Web. Many server applications are multithreaded; unfortunately, thread programming is difficult [3]. One way to reduce the complexity of these programs is to use threads as a work crew [2]. Since servers typically handle a lot of traffic, a server could potentially spawn a large number of threads on the fly. This incurs a large overhead when it is performed for short-lived operations. A better solution is to create a pool of threads beforehand. Each time a new client task enters the system, an idle thread wakes up and performs the task. This strategy is particularly effective because it amortizes the thread creation and destruction overhead over many client tasks and, once the server has initialized, reduces the cost of starting a client task to a mere procedure call. This method of thread organization is called a threadpool. The goal of our project is to create a generic threadpool library for POSIX threads (pthreads).
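The strategy described above can be illustrated with a minimal sketch. It is not the library's actual interface: every name in it (pool, pool_create, pool_dispatch, pool_destroy, pool_demo) is hypothetical, and for brevity it uses one mutex and a single condition variable. Workers are created once, sleep until a task is queued, and dispatching a task costs one procedure call.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not the library's actual API. */
typedef void (*task_fn)(void *);
typedef struct task { task_fn fn; void *arg; struct task *next; } task;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t work_ready;  /* signalled on new task or on shutdown */
    task *head, *tail;          /* FIFO of pending tasks */
    int shutdown;               /* set once by pool_destroy */
    int nthreads;
    pthread_t *threads;
} pool;

/* Each pre-created worker sleeps until a task arrives, then runs it. */
static void *worker(void *p)
{
    pool *tp = p;
    pthread_mutex_lock(&tp->lock);
    for (;;) {
        while (tp->head == NULL && !tp->shutdown)
            pthread_cond_wait(&tp->work_ready, &tp->lock); /* no busy-wait */
        if (tp->head == NULL)
            break;                        /* shutting down, queue drained */
        task *t = tp->head;
        tp->head = t->next;
        if (tp->head == NULL)
            tp->tail = NULL;
        pthread_mutex_unlock(&tp->lock);  /* run the task without the lock */
        t->fn(t->arg);
        free(t);
        pthread_mutex_lock(&tp->lock);
    }
    pthread_mutex_unlock(&tp->lock);
    return NULL;
}

pool *pool_create(int nthreads)
{
    if (nthreads <= 0) return NULL;
    pool *tp = calloc(1, sizeof *tp);
    if (tp == NULL) return NULL;
    tp->threads = malloc((size_t)nthreads * sizeof *tp->threads);
    if (tp->threads == NULL) { free(tp); return NULL; }
    pthread_mutex_init(&tp->lock, NULL);
    pthread_cond_init(&tp->work_ready, NULL);
    for (tp->nthreads = 0; tp->nthreads < nthreads; tp->nthreads++)
        if (pthread_create(&tp->threads[tp->nthreads], NULL, worker, tp) != 0)
            break;                        /* keep the workers that started */
    if (tp->nthreads == 0) {              /* no worker could be created */
        pthread_cond_destroy(&tp->work_ready);
        pthread_mutex_destroy(&tp->lock);
        free(tp->threads);
        free(tp);
        return NULL;
    }
    return tp;
}

/* Queue a task and wake one idle worker. */
int pool_dispatch(pool *tp, task_fn fn, void *arg)
{
    assert(tp != NULL && fn != NULL);
    task *t = malloc(sizeof *t);
    if (t == NULL) return -1;
    t->fn = fn; t->arg = arg; t->next = NULL;
    pthread_mutex_lock(&tp->lock);
    if (tp->tail != NULL) tp->tail->next = t; else tp->head = t;
    tp->tail = t;
    pthread_cond_signal(&tp->work_ready);
    pthread_mutex_unlock(&tp->lock);
    return 0;
}

/* Let the workers drain the queue, then join them and release the pool. */
void pool_destroy(pool *tp)
{
    pthread_mutex_lock(&tp->lock);
    tp->shutdown = 1;
    pthread_cond_broadcast(&tp->work_ready);
    pthread_mutex_unlock(&tp->lock);
    for (int i = 0; i < tp->nthreads; i++)
        pthread_join(tp->threads[i], NULL);
    pthread_cond_destroy(&tp->work_ready);
    pthread_mutex_destroy(&tp->lock);
    free(tp->threads);
    free(tp);
}

static void square(void *arg) { int *v = arg; *v *= *v; }

/* Usage: square 1..8 on four workers and return the sum of the results. */
int pool_demo(void)
{
    int values[8], sum = 0;
    pool *tp = pool_create(4);
    if (tp == NULL) return -1;
    for (int i = 0; i < 8; i++) {
        values[i] = i + 1;
        if (pool_dispatch(tp, square, &values[i]) != 0)
            square(&values[i]);           /* fall back to running inline */
    }
    pool_destroy(tp);                     /* all tasks finish before return */
    for (int i = 0; i < 8; i++) sum += values[i];
    return sum;
}
```

In this sketch the shutdown path doubles as the wait for outstanding work; a production pool would normally offer a separate way to wait for completion without destroying the workers.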
We built the threadpool library to satisfy the following requirements:
The only additional requirement on the threadpool is that the programmer must not attempt to execute threadpool functions through the threadpool's own dispatch. Any such function call returns immediately without executing any code.
Part 2:
The threadpool is written in C and uses pthreads (POSIX threads) to handle threading; it ports without change to many flavors of UNIX. There are essentially two parts. The first is a dynamically sized queue. Instead of destroying a node with free after use, it stores unused nodes in a separate queue, which removes the overhead of repeated malloc and free calls. The second part is the threadpool functions themselves; by using a mutex and three condition variables, they avoid deadlock and busy-waiting, and the code behaves as designed. The total amount of code, excluding comments and debugging statements, is only 400 lines, half of which is the customized queue. Adding a work crew to a single-threaded server takes only three lines, which should greatly ease the task of application developers.
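The node-recycling idea can be sketched as follows. This is an illustration under stated assumptions, not the library's queue: the names rqueue, rq_init, rq_push, rq_pop, rq_destroy and rq_demo are hypothetical, the spare nodes are kept on a simple linked list rather than a second queue, and the allocations counter exists only to make the saving visible. The sketch is not thread-safe on its own; a caller is assumed to hold the pool's mutex around each call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names; a sketch of the recycling idea only. */
typedef struct node { void *item; struct node *next; } node;

typedef struct {
    node *head, *tail;   /* live FIFO of queued items */
    node *spare;         /* recycled nodes awaiting reuse */
    size_t allocations;  /* malloc calls so far, tracked for illustration */
} rqueue;

void rq_init(rqueue *q)
{
    q->head = q->tail = q->spare = NULL;
    q->allocations = 0;
}

/* Append an item, reusing a spare node when one is available. */
int rq_push(rqueue *q, void *item)
{
    assert(q != NULL);
    node *n = q->spare;
    if (n != NULL) {
        q->spare = n->next;          /* reuse: no malloc needed */
    } else {
        n = malloc(sizeof *n);
        if (n == NULL) return -1;
        q->allocations++;
    }
    n->item = item;
    n->next = NULL;
    if (q->tail != NULL) q->tail->next = n; else q->head = n;
    q->tail = n;
    return 0;
}

/* Remove the oldest item; the node goes to the spare list, not to free. */
void *rq_pop(rqueue *q)
{
    assert(q != NULL);
    node *n = q->head;
    if (n == NULL) return NULL;
    q->head = n->next;
    if (q->head == NULL) q->tail = NULL;
    void *item = n->item;
    n->next = q->spare;
    q->spare = n;
    return item;
}

/* Release every node, live or spare. */
void rq_destroy(rqueue *q)
{
    node *lists[2] = { q->head, q->spare };
    for (int i = 0; i < 2; i++) {
        for (node *n = lists[i]; n != NULL; ) {
            node *next = n->next;
            free(n);
            n = next;
        }
    }
    q->head = q->tail = q->spare = NULL;
}

/* Usage: 2000 pushes with at most two items in flight at any time. */
size_t rq_demo(void)
{
    rqueue q;
    int a = 1, b = 2;
    rq_init(&q);
    for (int i = 0; i < 1000; i++) {
        if (rq_push(&q, &a) != 0 || rq_push(&q, &b) != 0)
            break;
        void *first = rq_pop(&q);
        void *second = rq_pop(&q);
        assert(first == &a && second == &b);   /* FIFO order is preserved */
    }
    size_t result = q.allocations;
    rq_destroy(&q);
    return result;
}
```

Because no more than two items are ever queued at once in rq_demo, only the first two pushes allocate; every later push reuses a recycled node.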
Part 3:
The threadpool was tested using a very basic server application; it accepted TCP/IP requests, performed an amount of work fixed at server creation, and then responded to the request. The clients were simple applications that made a request to the server and then waited for a response. If the server dropped a request, the client would close its connection and start a new request. The network connecting the test machines is a switched Ethernet running at a minimum of 100 Mbps; tests were conducted at night in order to minimize the effect of other traffic. The time to send one packet across this connection was 2.230 ms.
Our tests ranged from I/O-intensive to computation-intensive applications; in all cases, adding threads to the threadpool gave better performance than the monolithic implementation. For a complete description of the machines used for testing and the results of our tests, please visit: http://www.clarkson.edu/~allenpd/threadpool/testing.html
Our threadpool provides a simple interface for an application writer to create multithreaded applications. People tend to implement their own threadpools when they need them, yet it is hard to get a threadpool right without extensive testing and proofs of correctness. Therefore, we believe a solid, fully tested version of a threadpool is a valuable asset to the open source community.
References: