Scalable Kernel TCP Design and Implementation for Short-Lived Connections

Title: Scalable Kernel TCP Design and Implementation for Short-Lived Connections
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Lin, Xiaofeng; Chen, Yu; Li, Xiaodong; Mao, Junjie; He, Jiaquan; Xu, Wei; Shi, Yuanchun
Conference Name: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems
Publisher: ACM
Conference Location: New York, NY, USA
ISBN Number: 978-1-4503-4091-5
Keywords: multicore system, operating system, pubcrawl170201, TCP/IP
Abstract

With the rapid growth of network bandwidth, increases in CPU cores on a single machine, and application API models demanding more short-lived connections, a scalable TCP stack is performance-critical. Although many clean-slate designs have been proposed, production environments still call for a bottom-up parallel TCP stack design that is backward-compatible with existing applications. We present Fastsocket, a BSD Socket-compatible and scalable kernel socket design, which achieves table-level connection partition in the TCP stack and guarantees connection locality for both passive and active connections. The Fastsocket architecture is a ground-up partition design, from NIC interrupts all the way up to applications, which naturally eliminates various lock contentions in the entire stack. Moreover, Fastsocket maintains the full functionality of the kernel TCP stack and a BSD-socket-compatible API, and thus applications need no modifications. Our evaluations show that Fastsocket achieves a speedup of 20.4x on a 24-core machine under a workload of short-lived connections, outperforming the state-of-the-art Linux kernel TCP implementations. When scaling up to 24 CPU cores, Fastsocket increases the throughput of Nginx and HAProxy by 267% and 621% respectively compared with the base Linux kernel. We also demonstrate that Fastsocket can achieve scalability and preserve the BSD socket API at the same time. Fastsocket is already deployed in the production environment of Sina WeiBo, serving 50 million daily active users and billions of requests per day.

URL: http://doi.acm.org/10.1145/2872362.2872391
DOI: 10.1145/2872362.2872391
Citation Key: lin_scalable_2016