Evangelia Kalyvianaki, City University
Abstract
Federated stream processing systems, which utilise nodes from multiple independent domains, are increasingly found in multi-provider cloud deployments, Internet of Things systems, collaborative sensing applications and large-scale grid systems. To pool resources from several sites and take advantage of local processing, submitted queries are split into query fragments, which are executed collaboratively by different sites. When supporting many concurrent users, however, queries may exhaust the available processing resources, thus requiring constant load shedding. Given that individual sites have autonomy over how they allocate query fragments to their nodes, ensuring global fairness in the processing quality experienced by queries in a federated scenario remains an open challenge. We describe THEMIS, a federated stream processing system for resource-starved, multi-site deployments. It executes queries in a globally fair fashion and provides users with constant feedback on the processing quality experienced by their queries. THEMIS associates stream data with its Source Information Content (SIC), a metric that quantifies the contribution of that data towards the query result, based on the amount of source data used to generate it. We provide the BALANCE-SIC distributed load shedding algorithm, which aims to balance the SIC values of result data. Our extensive evaluation validates that the BALANCE-SIC algorithm yields balanced SIC values across queries, as measured by Jain's fairness index. Our approach also incurs a low execution time overhead.
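As a rough illustration of the fairness metric mentioned in the abstract, the sketch below computes Jain's index, (sum of x)^2 / (n * sum of x^2), over a list of per-query values. The function name `jain_index` and the sample SIC values are assumptions made for this example only; they are not taken from THEMIS or its evaluation.

```python
def jain_index(values):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Returns 1.0 when all values are equal and 1/n when a single
    value holds everything.
    """
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    total = sum(values)
    squares = sum(v * v for v in values)
    if squares == 0:
        # Every value is zero, so all queries are treated equally.
        return 1.0
    return total * total / (len(values) * squares)


# Hypothetical per-query SIC values, chosen only to show the two extremes.
balanced = [0.5, 0.5, 0.5, 0.5]
skewed = [1.0, 0.0, 0.0, 0.0]

print(jain_index(balanced))
print(jain_index(skewed))
```

An index close to 1 indicates that the queries receive similar values, while an index close to 1/n indicates that one query dominates.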
About the speaker
Eva Kalyvianaki is a Lecturer (Assistant Professor) in the Department of Computer Science at City University London. Before this, she was a post-doctoral researcher in the Department of Computing, Imperial College London. She holds a Ph.D. from the Computer Laboratory (SRG/netos group) at Cambridge University and M.Sc. and B.Sc. degrees from the Computer Science Department of the University of Crete, Greece. Her interests span the areas of Cloud Computing, Data Stream Processing, Autonomic Computing, Distributed Systems and Systems Research in general.