The only approach I know that scales for "Internet Services" is non-blocking I/O with just one thread per CPU hardware thread (i.e. avoiding thread context switches).
Edit: After RTFM, I see that this is the purpose of the paper: simulating threads that do blocking I/O on top of epoll and/or select. Great paper.
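For anyone who wants to see what I mean by non-blocking I/O on a single thread, here is a rough sketch in Python. This is my own illustration, not code from the paper: the function name `serve_echo` and the `max_connections` stop condition are made up so the loop can terminate. It uses the standard `selectors` module, which picks epoll, kqueue, poll, or select depending on the platform.

```python
import selectors
import socket


def serve_echo(listener, max_connections):
    """Echo data back to clients from one thread, using readiness events.

    Hypothetical helper: returns once `max_connections` clients have
    connected and then closed, so that the loop can stop.
    """
    sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    closed = 0
    while closed < max_connections:
        # Block only here, waiting until some socket is ready to be read.
        for key, _events in sel.select():
            sock = key.fileobj
            if sock is listener:
                # A new client is waiting; accept it without blocking.
                conn, _addr = listener.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
                continue
            data = sock.recv(4096)
            if data:
                # Simplification: assumes the send buffer has room.
                # A real server would queue the data and wait for EVENT_WRITE.
                sock.sendall(data)
            else:
                # An empty read means the peer closed the connection.
                sel.unregister(sock)
                sock.close()
                closed += 1
    sel.unregister(listener)
    sel.close()
```

To get the "one thread per hardware thread" part, you would run one such loop per core (in Python, most likely one process per core) instead of one thread per connection.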