Real-time stream data plays an increasingly important role on the Internet. One cause is the proliferation of geographically distributed stream data sources connected to the Internet, such as sensor networks, scientific instruments, pervasive computing environments and web feeds. Potentially millions of users world-wide want to take advantage of this data and therefore require a convenient way to process real-time streams at a global scale through applications that perform Internet-scale stream processing (ISSP). Much as relational queries do in a DBMS, stream-processing systems allow users to access and manipulate distributed data streams through declarative queries.
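For illustration, such a query could be written in a CQL-style syntax, as popularised by the STREAM project (the stream and attribute names below are invented for this sketch, not taken from any particular ISSP system). A query such as

    SELECT sensor_id, AVG(temperature)
    FROM SensorReadings [Range 1 Minute]
    GROUP BY sensor_id

runs continuously, re-computing a per-sensor average as tuples enter and leave the sliding one-minute window, rather than returning a one-shot answer.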
The scale of an Internet-wide system poses substantial challenges to providing a dependable service: such a system must gracefully handle the failure of network links and processing hosts while managing a large pool of CPU and network resources.
As an example of such an application, astronomers want to detect transient sky events, such as gamma-ray bursts, in real time. To detect these events, they must correlate real-time image streams from geographically distributed radio telescopes. The events last only minutes and, once one has been detected, instruments must be re-aligned quickly to focus on the ongoing occurrence.
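To make the processing pattern concrete, the following is a minimal sketch of the kind of windowed stream correlation such an application performs. It is illustrative only: the field names, the 60-second window and the detection predicate are invented, the join is one-sided for brevity (a real system would join symmetrically), and real telescopes produce images rather than scalar flux values.

    from collections import deque

    WINDOW = 60.0  # seconds; only pair images taken close together in time

    def detect_event(img_a, img_b):
        # Hypothetical predicate: a simultaneous brightness spike at both
        # telescopes hints at a transient event such as a gamma-ray burst.
        return img_a["flux"] > 100 and img_b["flux"] > 100

    def windowed_correlate(stream_a, stream_b, window=WINDOW):
        # Pair each tuple of stream_a with the recent tuples of stream_b
        # that fall inside the time window. Both streams are assumed to
        # arrive in timestamp ("ts") order.
        buf = deque()                 # b-tuples still inside the window
        b_iter = iter(stream_b)
        pending = next(b_iter, None)
        for a in stream_a:
            while pending is not None and pending["ts"] <= a["ts"]:
                buf.append(pending)   # admit b-tuples up to a's timestamp
                pending = next(b_iter, None)
            while buf and a["ts"] - buf[0]["ts"] > window:
                buf.popleft()         # expire b-tuples that left the window
            for old in buf:
                if detect_event(a, old):
                    yield a, old

    # Two synthetic telescope feeds with a coincident spike around t=95..120:
    feed_a = [{"ts": t, "flux": 110 if t == 120 else 50} for t in range(0, 300, 30)]
    feed_b = [{"ts": t, "flux": 120 if t == 95 else 40} for t in range(5, 300, 30)]
    for pair in windowed_correlate(feed_a, feed_b):
        print("possible transient event:", pair)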
The DISSP project investigates how to build dependable Internet-scale stream-processing systems for interconnecting tomorrow's pervasive sensor systems and global scientific experiments. We argue that Internet-scale stream processing needs new models for achieving dependability, which is a significant challenge for several reasons: (1) failure will be the common case because, given the system's size, a fraction of Internet paths and hosts will be unavailable at any given time; (2) the real-time nature of the data leaves little time to recover from failure; and (3) a shared infrastructure such as an ISSP system will experience high utilisation, so the additional resource demand during recovery can overload it. The traditional wisdom of substantially over-provisioning a system to compensate for failure is infeasible on such a shared, federated platform.
We therefore believe that we must depart from the hard dependability guarantees of traditional DBMSs and of today's stream-processing systems. Ensuring that no tuple is ever lost may be feasible within a single data centre, but we cannot hope to achieve this at Internet scale. Instead, we explore dependability guarantees that are driven by application requirements. Many sensing applications can cope with a controlled degradation of result quality. While result quality is reduced, the system provides continuous feedback to users on the achieved level of service. This feedback is expressed in a domain-specific way, e.g., by notifying a scientific user of the reduced detection confidence for events of interest. It also drives an adaptive fault-tolerance mechanism that allows the DISSP system to make strategic resource-allocation decisions, minimising the reduction in service quality for as many users as possible.
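As a toy illustration of quality-driven feedback and adaptation (the metric and the greedy policy below are assumptions made for this sketch, not the project's actual mechanisms), delivered quality could be reported as the fraction of upstream sources that contributed to a result, and spare recovery capacity could be spent on the worst-served queries first:

    def result_quality(contributing_sources, total_sources):
        # Toy metric: fraction of upstream sources whose data reached the
        # result; 1.0 means the result is based on complete input.
        return contributing_sources / total_sources

    def plan_recovery(queries, spare_slots):
        # Greedy sketch of adaptive fault tolerance: repair the queries
        # with the lowest delivered quality first, lifting as many users
        # as possible back towards an acceptable level of service.
        worst_first = sorted(queries, key=lambda q: q["quality"])
        return [q["name"] for q in worst_first[:spare_slots]]

    queries = [
        {"name": "burst detection",  "quality": result_quality(5, 8)},
        {"name": "feed aggregation", "quality": result_quality(9, 10)},
    ]
    print(plan_recovery(queries, spare_slots=1))  # -> ['burst detection']
    # The burst-detection user would also be told that detection
    # confidence currently stands at roughly 62% of nominal.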
The objectives of the DISSP project are to: