Mercury: Hybrid Centralized and Distributed Scheduling in Large Shared Clusters
Konstantinos Karanasos, Microsoft Research
Datacenter-scale computing for analytics workloads is increasingly common. High operational costs force heterogeneous applications to share cluster resources to achieve economies of scale. Scheduling such large and diverse workloads is inherently hard, and existing approaches tackle it in two alternative ways: 1) centralized solutions offer strict, secure enforcement of scheduling invariants (e.g., fairness, capacity) for heterogeneous applications; 2) distributed solutions offer scalable, efficient scheduling for homogeneous applications. We argue that these solutions are complementary, and advocate a blended approach. Concretely, we propose Mercury, a hybrid resource management framework that supports the full spectrum of scheduling, from centralized to distributed. Mercury exposes a programmatic interface that allows applications to trade off scheduling overhead against execution guarantees. Our framework harnesses this flexibility by opportunistically utilizing resources to improve task throughput. Experimental results on production-derived workloads show gains of over 35% in task throughput. Appropriate application and framework policies can translate these benefits into job throughput or job latency improvements. We have implemented Mercury as an extension of Apache Hadoop / YARN and are currently contributing it to that project. This work will appear in USENIX ATC and is joint work with Sriram Rao, Carlo Curino, Chris Douglas, Kishore Chaliparambil, Giovanni Fumarola, Solom Heddaya, Raghu Ramakrishnan, and Sarvesh Sakalanaga.
About the speaker
Konstantinos Karanasos joined the Cloud and Information Services Lab (CISL) at Microsoft as a Senior Scientist in March 2014, where he works on cloud-scale cluster resource management, distributed data platforms, and query processing and optimization. He is currently based in Cambridge, UK, where he also collaborates with the Systems & Networking group of Microsoft Research. Prior to joining Microsoft, Konstantinos was a postdoctoral researcher at IBM Almaden Research Center, where he was a member of the Big Data analytics group. At Almaden, he worked on Jaql, a platform for analyzing large datasets in parallel using Hadoop’s MapReduce framework, and built a system that extended Jaql with dynamic optimization capabilities. Part of his work at Almaden was transferred to the IBM BigInsights product. Konstantinos obtained his PhD in Computer Science from Inria, France, working on view-based techniques for semi-structured data, under the supervision of Ioana Manolescu and Francois Goasdoue. Prior to that, he received his Diploma in Electrical and Computer Engineering from the National Technical University of Athens, Greece, where he completed his Diploma thesis under the supervision of Timos Sellis.
Date & Time
Thursday, July 30, 2015 - 14:00
Huxley 218