Data stream processing systems process data from high-velocity data sources and must fulfill strict requirements regarding system throughput and end-to-end latency. A major challenge for such systems is handling unpredictable load peaks. To that end, most systems use overprovisioning. However, this leads to low system utilization and high monetary cost for the user. A potential solution to this problem is to make the system elastic: an elastic system adapts the number of virtual machines it uses based on the current workload. The two major design decisions in building such a system are state management and the definition of the scaling strategies used.
In this talk we will focus on the second issue and discuss the problem of choosing a good scaling strategy, which should minimize the resources used while ensuring the expected quality of service, e.g., a bound on end-to-end latency. We will present two contributions to this problem: (1) a latency-aware scaling strategy and (2) an online parameter optimization that automatically identifies a good configuration for a scaling strategy. Both approaches have been implemented in a state-of-the-art data stream processing engine and outperform existing scaling strategies in several real-world scenarios.
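To make the trade-off concrete, a minimal sketch of a latency-aware scaling rule is shown below. This is an illustration only, not the strategy presented in the talk: the function name, thresholds, and the simple scale-out/scale-in policy are all hypothetical assumptions.

```python
def decide_scaling(current_vms, observed_latency_ms, target_latency_ms,
                   utilization, low_util_threshold=0.4):
    """Hypothetical latency-aware scaling rule.

    Scale out when the observed end-to-end latency violates the target;
    scale in when the cluster is underutilized; otherwise keep the
    current number of virtual machines.
    """
    if observed_latency_ms > target_latency_ms:
        return current_vms + 1  # scale out to restore the latency target
    if utilization < low_util_threshold and current_vms > 1:
        return current_vms - 1  # scale in to reduce monetary cost
    return current_vms          # workload is handled at acceptable cost

# Example: latency violated at 4 VMs -> scale out to 5.
print(decide_scaling(4, 250, 200, 0.8))
```

Real strategies must also smooth noisy latency measurements and avoid oscillation; tuning parameters such as `low_util_threshold` is exactly what the online parameter optimization in contribution (2) addresses.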