Assignment 12

Read "Flash: An Efficient and Portable Web Server," by Pai, Druschel, and Zwaenepoel (reading #7). This paper was published at the USENIX Annual Technical Conference and, like most papers published in that venue, describes the implementation of a system in great detail. As you read the paper, try not to get bogged down in the myriad details presented by the authors, such as which version of FreeBSD the test machines were running and how much memory was installed. Instead, focus on the stated problem: creating a high-performance and portable Web server.

The paper compares four different server architectures: multi-process (MP), multi-threaded (MT), single-process event-driven (SPED), and asymmetric multi-process event-driven (AMPED). These architectures differ mainly in how they achieve concurrency; that is, how they are able to process a new request without waiting for previous requests to complete. Make sure you understand the paper's idea of concurrency; it is different from, for example, computational concurrency in a machine with multiple processors. For each architecture, try to figure out:

  • How does it perform when serving files from memory?
  • How does it perform when serving files from disk?
  • How portable is it?

Since you probably don't have the experience that the authors expect from their readers, here are a few things to keep in mind:

  • The authors talk about "asynchronous" or "non-blocking" I/O operations. A read operation on a network socket, for example, is called "blocking" if the system call does not return until data is available to read. A non-blocking read operation returns immediately, whether or not data is available; the application can use the select system call to wait until data is available on any socket. If a system uses blocking operations, how can it achieve concurrency?

  • There are two kinds of threads: user threads and kernel threads. With user threads, the scheduler is part of the user program, and the kernel is not aware of the multiple threads; with kernel threads, the kernel handles scheduling and keeps track of each thread. In this paper, the authors are talking about kernel threads. The main advantage of kernel threads is that when one thread blocks on a system call, the other threads in the same process can continue.

  • When this paper was written, support for kernel threads was not available on every OS. How did this impact Flash's design? Since that time, kernel threads have become more generally available, but they are still not well standardized. If you are interested in the state of the art, read the man pages for rfork and clone. FreeBSD uses rfork, and Linux® uses clone. Note that both are modeled on the rfork call of the experimental OS Plan 9.

  • Processes are generally more expensive than threads. Processes take more time to create and occupy more memory. The kernel takes less time to perform a context switch between two threads than between two processes. What key architectural differences between processes and threads cause these performance differences?

  • The AMPED design seems to dedicate a lot of complexity to performing asynchronous disk I/O. As a thought exercise, consider how you would "fix" the select syscall to allow asynchronous disk I/O. How would this "fixed" select simplify the design of Flash?
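
If the difference between blocking and non-blocking reads described above feels abstract, the following minimal sketch in Python may help. It is not code from the paper. It assumes two hypothetical client connections, simulated here with socket.socketpair(), and the helper names try_read, poll_once, and demo are made up for illustration. It shows a non-blocking read returning immediately when no data is available, and select reporting only the socket that has data waiting.

```python
import select
import socket


def try_read(sock, nbytes=4096):
    """Non-blocking read: return the data if any is available, else None."""
    try:
        return sock.recv(nbytes)
    except BlockingIOError:
        # A blocking socket would have waited here instead of raising.
        return None


def poll_once(socks, timeout=1.0):
    """Wait (via select) until some socket is readable, then read each ready one."""
    readable, _, _ = select.select(socks, [], [], timeout)
    return {s: try_read(s) for s in readable}


def demo():
    # Two hypothetical client connections, each a connected pair of sockets.
    client_a, server_a = socket.socketpair()
    client_b, server_b = socket.socketpair()
    for s in (server_a, server_b):
        s.setblocking(False)  # reads on the server side must never block

    idle = try_read(server_a)      # nothing sent yet: returns None right away
    client_b.sendall(b"GET /b")    # only client B sends a request
    ready = poll_once([server_a, server_b])

    result = (idle, list(ready.values()), server_b in ready, server_a in ready)
    for s in (client_a, server_a, client_b, server_b):
        s.close()
    return result
```

Calling demo() exercises both cases: the first read finds nothing and returns without waiting, and select then hands back only the connection from client B. A single process can repeat poll_once in a loop to serve whichever connections are ready, without ever waiting on an idle one.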