sRoute: Treating the Storage Stack Like a Network [FAST'16]
Eno Thereska, Confluent and Imperial College
Abstract
In a data center, an IO from an application to distributed storage traverses not only the network, but also several software stages with diverse functionality. This set of ordered stages is known as the storage or IO stack. Stages include caches, hypervisors, IO schedulers, file systems, and device drivers. Indeed, in a typical data center, the number of these stages is often larger than the number of network hops to the destination. Yet, while packet routing is fundamental to networks, no notion of IO routing exists on the storage stack. The path of an IO to an endpoint is predetermined and hard-coded. This forces IO with different needs (e.g., requiring different caching or replica selection) to flow through a one-size-fits-all IO stack structure, resulting in an ossified IO stack. This talk proposes sRoute, an architecture that provides a routing abstraction for the storage stack. sRoute comprises a centralized control plane and “sSwitches” on the data plane. The control plane sets the forwarding rules in each sSwitch to route IO requests at runtime based on application-specific policies. A key strength of our architecture is that it works with unmodified applications and VMs. We show significant benefits of customized IO routing to data center tenants (e.g., a factor of ten for tail IO latency, more than 60% better throughput for a customized replication protocol and a factor of two in throughput for customized caching).
About the speaker
Eno Thereska is Member of Technical Staff at Confluent and an Honorary Research Fellow at Imperial. Previously he was a Senior Researcher at Microsoft Research. He has broad interests in computer systems. Eno has a PhD from Carnegie Mellon University.