Publications

GRANNY: Granular Management of Compute-Intensive Applications in the Cloud
Abstract
Parallel applications are typically implemented using multi-threading (with shared memory, e.g., OpenMP) or multi-processing (with message passing, e.g., MPI). While it seems attractive to deploy such applications in cloud VMs, existing cloud schedulers fail to manage these applications efficiently: they cannot scale multi-threaded applications dynamically when more vCPUs in a VM become available, and they cause fragmentation over time because of the static allocation of multi-process applications to VMs. We describe GRANNY, a new distributed runtime that enables the fine-granular management of multi-threaded/process applications in cloud environments. GRANNY supports the vertical scaling of multi-threaded applications within a VM and the horizontal migration of multi-process applications between VMs. GRANNY achieves both through a single WebAssembly-based execution abstraction: Granules can execute application code with thread or process semantics and allow for efficient snapshotting. GRANNY scales up applications by adding more Granules at runtime, and de-fragments applications by migrating Granules between VMs. In both cases, it launches new Granules from snapshots efficiently. We evaluate GRANNY with dynamic scheduling policies and show that, compared to current schedulers, it reduces the makespan for OpenMP workloads by up to 60% and the fragmentation for MPI workloads by up to 25%.
Venue
22nd USENIX Symposium on Networked Systems Design and Implementation (NSDI)
Publication Year
2025
Related Projects
Links