Big Data Processing

By grt17, 16 March 2020

Window aggregation queries are a core part of streaming applications. To support window aggregation efficiently, stream processing engines face a trade-off between exploiting parallelism (at the instruction and multi-core levels) and incremental computation (across overlapping windows and queries). Existing engines implement ad-hoc aggregation and parallelization strategies. As a result, they achieve high performance only for specific queries, depending on the window definition and the type of aggregation function.
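To make the incremental side of this trade-off concrete, here is a minimal sketch, assuming a count-based sliding window with a slide of one element and a sum aggregate. The function names are hypothetical and are not taken from any particular engine. The naive version recomputes every window from scratch, while the incremental version reuses the previous window's result.

```python
from collections import deque


def sliding_sums_naive(values, size):
    """Recompute each window from scratch: O(size) work per window."""
    return [sum(values[i:i + size]) for i in range(len(values) - size + 1)]


def sliding_sums_incremental(values, size):
    """Reuse the previous window's sum: O(1) work per window."""
    window = deque()
    total = 0
    results = []
    for value in values:
        window.append(value)
        total += value
        if len(window) > size:
            # Evict the element that just left the window.
            total -= window.popleft()
        if len(window) == size:
            results.append(total)
    return results


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    print(sliding_sums_naive(data, 3))
    print(sliding_sums_incremental(data, 3))
```

The subtraction step works only because sum is invertible. Aggregates such as min or max cannot be updated this way, which is one reason the best strategy depends on the type of aggregation function.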

By scs17, 6 December 2019

Faasm is a high-performance stateful serverless runtime. The goal of the project is to enable fast, efficient parallel applications in serverless environments.

Faasm provides multi-tenant isolation, but also lets functions share regions of memory. These shared memory regions give low-latency concurrent access to data, supporting high-performance distributed serverless applications.

By paublin, 2 June 2016

Real-time stream data has begun to play an increasingly important role on the Internet. One cause is the proliferation of geographically distributed stream data sources such as sensor networks, scientific instruments, pervasive computing environments and web feeds connected to the Internet. Potentially millions of users worldwide want to take advantage of the availability of this data. They therefore need a convenient way to process real-time stream data at a global scale through applications that perform Internet-scale stream processing (ISSP).

By wculhane, 13 April 2016

Stream processing has seen growing uptake in modern real-time data analytics applications, ranging from credit fraud detection to clickstream analysis. The key challenge for these applications is to process a continuous stream of input records (e.g. credit card transactions, click and search term logs) at a high rate and with low latency.

By wculhane, 11 April 2016

One well-publicized promise of the information age is an expanded ability to develop richer analyses of big data in search of underlying information. To this end, there has been a focus on handling larger amounts of data, in the hope that processing more data provides more information. While this has produced some amazing tools, it ignores one of the main dimensions of creating in-depth analysis: the variability of the analysis function.

By wculhane, 11 April 2016

The ITA-DSM project investigates the design of data stream management systems (DSMSs) for real-time, data-intensive applications in non-traditional environments such as mobile ad-hoc and wireless sensor networks. In these environments, connectivity to backend cloud infrastructure can be intermittent, bandwidth-constrained or, in the worst case, unavailable. Instead of offloading data stream management (DSM) to a cloud backend, the project seeks to exploit the growing computational capabilities of modern mobile phones and IoT devices by performing DSM in situ using the combined resources of multiple devices.

By admin, 20 March 2016

This project investigates a new stateful processing model for data-parallel "big data" applications.