Analytical queries virtually always involve aggregation and statistics. SQL offers a wide range of features for summarizing data, such as associative aggregates, distinct aggregates, ordered-set aggregates, grouping sets, and window functions. In this talk, I outline a unified framework for advanced statistics that composes all flavors of complex SQL aggregates from low-level plan operators. These operators can reuse materialized intermediate results, which decouples monolithic aggregation logic and speeds up complex multi-expression queries. The contribution is therefore twofold: the framework modularizes aggregate implementations, and it outperforms traditional systems whenever multiple aggregates are combined. We integrated the approach into the high-performance database system Umbra and show experimentally that it computes complex aggregates faster than the state-of-the-art HyPer system.
André Kohn is currently pursuing a PhD in Computer Science in the Database Group at the Technical University of Munich. His research focuses on efficient and interactive data analytics with work on adaptive query compilation, modular aggregation, and scalable query processing with WebAssembly.