Prof Asaf Cidon and Team Receive Jay Lepreau Best Paper Award at 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI '22)
EE Prof Asaf Cidon, CS PhD student Yuhong Zhong, and team received the Jay Lepreau Best Paper Award at the 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI '22) for the paper "XRP: In-Kernel Storage Functions with eBPF."
Storage devices are widely used to store and persist large amounts of data for many applications, such as databases, web search engines, and social networks. Kernel software provides the interface for applications to interact with storage devices safely and efficiently. However, with the rise of new high-performance memory technologies, such as 3D XPoint and low-latency NAND, new storage devices can now achieve up to 7 GB/s bandwidth and latencies as low as 3 µs. At such high performance, the kernel software becomes a major source of overhead, increasing application-observed latency and limiting throughput.
Kernel bypass libraries (e.g., SPDK) provide a way to circumvent the high kernel software overhead, but they also forgo access control, require intrusive application changes, and waste CPU cycles when storage utilization is low. In contrast, the team seeks a readily deployable mechanism that provides fast access to emerging storage devices, requires no specialized hardware or significant application changes, and works with existing kernels and file systems. To this end, they design and implement XRP (eXpress Resubmission Path), a high-performance storage data path using Linux eBPF. Linux eBPF lets applications offload simple functions to the kernel, which eliminates most of the kernel software overhead while ensuring isolation and low CPU cycle waste.
Linux eBPF has already been widely used in networking. However, using Linux eBPF to speed up storage introduces several unique challenges. Unlike existing networking use cases, where each eBPF function can operate in a self-contained manner on a particular packet, a storage eBPF function may need to synchronize with other concurrent application-level operations or require multiple function calls to traverse a large on-disk data structure.
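The need for multiple dependent function calls can be illustrated with a toy sketch. It is written in Python rather than eBPF and is not XRP's actual interface: the block layout, `make_block`, `step`, and `lookup` are all hypothetical names chosen for illustration. A small per-read function inspects one block and either returns the value or names the next offset to read, so a single lookup turns into a chain of reads.

```python
import struct

BLOCK_SIZE = 64  # hypothetical tiny block size, for illustration only
# Assumed layout for this sketch: a 1-byte leaf flag, then three keys and
# three slots (child offsets or values), all unsigned 32-bit integers.
FMT = "<B3I3I"


def make_block(is_leaf, keys, slots):
    """Pack one tree node into a fixed-size, zero-padded block."""
    return struct.pack(FMT, is_leaf, *keys, *slots).ljust(BLOCK_SIZE, b"\x00")


def step(block, key):
    """Inspect one block; return ('done', value) or ('next', offset)."""
    is_leaf, k0, k1, k2, s0, s1, s2 = struct.unpack_from(FMT, block)
    keys, slots = (k0, k1, k2), (s0, s1, s2)
    if is_leaf:
        for k, v in zip(keys, slots):
            if k == key:
                return ("done", v)
        return ("done", None)  # key is not present in this leaf
    # Internal node: follow the last child whose separator key is <= key.
    child = slots[0]
    for k, s in zip(keys, slots):
        if key >= k:
            child = s
    return ("next", child)


def lookup(disk, key, root_offset=0):
    """Chain reads until step() reports completion; also count the reads."""
    offset, reads = root_offset, 0
    while True:
        block = disk[offset:offset + BLOCK_SIZE]  # stands in for one device read
        reads += 1
        status, result = step(block, key)
        if status == "done":
            return result, reads
        offset = result  # issue the next read at the offset the function chose


# A two-level index: one root block at offset 0 and three leaves after it.
DISK = b"".join([
    make_block(0, (0, 100, 200), (64, 128, 192)),
    make_block(1, (1, 2, 3), (10, 20, 30)),
    make_block(1, (100, 150, 199), (1000, 1500, 1990)),
    make_block(1, (200, 250, 300), (2000, 2500, 3000)),
])

print(lookup(DISK, 150))  # (1500, 2)
```

Each iteration of the loop in `lookup` depends on the result of the previous read, which is the property that distinguishes this workload from self-contained per-packet processing.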
To address these challenges, the team studies many production databases and finds that most of the data structures they use (such as on-disk B-trees, log-structured merge trees, and log segments) are typically implemented on a small set of large files and are updated orders of magnitude less frequently than they are read. Based on these insights, the team focuses XRP exclusively on operations contained within one file and maintains a minimal amount of the file system mapping state in XRP, which they term the metadata digest. This allows XRP to safely support some of the most popular on-disk data structures.
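As a rough illustration of what a per-file mapping state could look like, the sketch below models it as a sorted list of extents. This is an assumption made for illustration, not a description of XRP's actual metadata digest: the `MetadataDigest` class, its extent tuple layout, and the `translate` method are hypothetical. The point it shows is that a request is translated only if it falls entirely inside the single file's mapped range, and is rejected otherwise.

```python
from bisect import bisect_right


class MetadataDigest:
    """Hypothetical extent map for one file.

    Each extent is (file_offset, device_offset, length), in bytes.
    """

    def __init__(self, extents):
        self.extents = sorted(extents)
        self.starts = [extent[0] for extent in self.extents]

    def translate(self, file_offset, length):
        """Return the device offset for a read, or None if it is not allowed."""
        i = bisect_right(self.starts, file_offset) - 1
        if i < 0:
            return None  # the offset lies before the first mapped extent
        start, device_offset, extent_length = self.extents[i]
        if file_offset + length > start + extent_length:
            # The read would leave this extent. This sketch simply rejects it,
            # including reads that would span two adjacent extents.
            return None
        return device_offset + (file_offset - start)


digest = MetadataDigest([(0, 4096, 8192), (8192, 65536, 4096)])
print(digest.translate(512, 512))    # 4608
print(digest.translate(12288, 512))  # None
```

Because such a map covers one file and changes only when the file's layout changes, it stays small and rarely needs refreshing for data structures that are read far more often than they are updated.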
A recent study shows that about 19% of CPU cycles in datacenters are spent on kernel software overhead, and this number will increase further as networking and storage devices become faster. The kernel software overhead not only slows down applications but also consumes more energy and results in a higher total cost of ownership. XRP is the first system that adopts Linux eBPF to reduce the kernel software overhead for storage. It achieves high throughput and low tail latency while leaving intact the kernel's existing isolation, security, and safety guarantees. The team open-sourced all the artifacts of XRP to enable more research in this field.
In addition, following the recent trend of offloading computation to network and storage devices, XRP can be extended to run storage eBPF functions on energy-efficient devices. By offloading application logic directly into smart storage devices, XRP has the potential to achieve even greater speedups and reduce energy consumption in datacenters.
- XRP: In-Kernel Storage Functions with eBPF
Yuhong Zhong, Haoyu Li, Yu Jian Wu, Ioannis Zarkadas, Jeffrey Tao, Evan Mesterhazy, Michael Makris, and Junfeng Yang, Columbia University; Amy Tai, Google; Ryan Stutsman, University of Utah; Asaf Cidon, Columbia University