DISSP: Dependable Stream Processing Systems

Real-time stream data has begun to play an increasingly important role on the Internet. One of the causes for this is the proliferation of geographically-distributed stream data sources such as sensor networks, scientific instruments, pervasive computing environments and web feeds connected to the Internet. Potentially millions of users world-wide want to take advantage of the availability of this data. Therefore they require a convenient way to process real-time stream data at a global scale through applications that perform Internet-scale stream processing (ISSP). Similar to the ease of relational queries in DBMS, stream-processing systems allow users to access and manipulate distributed data streams through declarative queries.

The scale of an Internet-wide system poses substantial challenges when it comes to providing a dependable service. Any such system must gracefully handle the failure of network links and processing hosts while managing a large pool of CPU and network resources.

For example, astronomers want to detect transient sky events, such as gamma-ray bursts, in real-time. To detect such events, they must correlate real-time image streams from geographically-distributed radio telescopes. These events only last for minutes and, after an event has been detected, instruments need to be re-aligned to focus on on-going occurrences.

Overview

The DISSP project investigates how to build dependable Internet-scale stream-processing systems for interconnecting tomorrow's pervasive sensor systems and global scientific experiments. We argue that Internet-scale stream processing needs new models for achieving dependability. Achieving dependability in this context is a significant challenge for several reasons: (1) failure will be the common case in the system. Due to its size, a fraction of Internet paths and hosts will be unavailable at any time; (2) the real-time nature of the data means that there is little time for recovery from failure; (3) a shared infrastructure, such as an ISSP system, will experience high utilisation. Consequently the additional resource demand during recovery can overload the system. The traditional wisdom of substantially over-provisioning a system to compensate for failure is infeasible in such a shared, federated platform.

Therefore we believe that we need to depart from the hard dependability guarantees of traditional DBMSs and today's stream-processing systems. Ensuring no tuple loss at all times may be feasible within a single data centre, but we cannot hope to achieve this at an Internet-scale. Instead, we explore dependability guarantees that are driven by application requirements. Many sensing applications can cope with a controlled degradation of result quality. While result quality is reduced, the system provides constant feedback to users on the achieved level of service. Feedback is expressed in a domain-specific way, e.g., by notifying a scientific user about the reduction in detection confidence of events of interest. This feedback also drives an adaptive fault-tolerance mechanism allowing the DISSP system to strategise about resource allocation in order to minimise the reduction in service quality of a maximum number of users.

Objectives

The objectives of the DISSP project are to:

  • investigate and develop a novel reliability model for DISSP that includes user-perceived quality of query results and provides feedback to users on quality degradation due to unmaskable network and host failures;
  • provide adaptive fault-tolerance mechanisms for DISSP systems that select appropriate strategies for maximising result quality over the lifetime of potentially long-running queries (eg over several months);
  • design, implement and evaluate a scalable prototype system for DISSP using controlled experiments on the Emulab network testbed; and
  • deploy an open, global and shared platform for DISSP as a public service on the PlanetLab research network and thus to facilitate and encourage the use of DISSP across research communities.

Funder
EPSRC (2008-2012)
Categories
Team
Eva Kalyvianaki (City University London, UK)
Marco Fiscato (SwiftKey, UK)
Quang Hieu Vu (EBTIC, UAE)

Related Publications

Evangelia Kalyvianaki, Marco Fiscato, Theodoros Salonidis, and Peter Pietzuch
ACM International Conference on Management of Data (SIGMOD), 2016
San Francisco, CA, USA
Evangelia Kalyvianaki, Themistoklis Charalambous, Marco Fiscato, and Peter Pietzuch
7th IEEE International Workshop on Feedback Computing (Feedback Computing), 2012
San Jose, CA, USA
Javier Cervino, Evangelia Kalyvianaki, Joaquin Salvachua, and Peter Pietzuch
7th International Workshop on Self Managing Database Systems (ICDEW), 2012
Washington, DC, USA
Wilhelm Kleiminger, Evangelia Kalyvianaki, and Peter Pietzuch
6th IEEE International Workshop on Self Managing Database Systems (SMDB), 2011
Evangelia Kalyvianaki, Wolfram Wiesemann, Quang Hieu Vu, Daniel Kuhn, and Peter Pietzuch
IEEE International Conference on Data Engineering (ICDE), 2011
Hannover, Germany
Wilhelm Kleiminger, Evangelia Kalyvianaki, and Peter Pietzuch
6th International Workshop on Self Managing Database Systems (ICDEW), 2011
Hannover, Germany
Marco Fiscato, Quang Hieu Vu, and Peter Pietzuch
7th International Workshop on Quality in Database (QDB), 2009
Lyon, France
Nicholas Poul Schultz-Moeller, Matteo Migliavacca, and Peter Pietzuch
International Conference on Distributed Event-Based Systems (DEBS), 2009
Nashville, TN, USA
Peter Pietzuch
Proceedings of the 2nd Workshop on Dependable Distributed Data Management in conjunction with EuroSys’08 (WDDDM), 2008
Glasgow, United Kingdom