Track: Stream Processing @ Scale


The growth of data volumes and connectivity is driving the active development of stream processing. We frequently see demands to process more data faster in order to get near real-time insights and decisions. The Lambda and Kappa architectures are among the popular responses to these demands. Technologies like Apache Spark and Apache Flink are exciting frameworks at our disposal to implement these or other paradigms.

Stream processing has many facets, from architecture for scale and resilience to processing paradigms such as micro-batching versus event-driven processing. The Stream Processing @ Scale track brings together current knowledge and developments in the area, from Internet giants to open source projects.

Track Host:
Christian Prokopp
Engineering Director @BigDataExperts
Christian Prokopp, Ph.D., is the Engineering Director at Big Data Partnership, a London-based consulting firm. Christian develops Big Data strategies, architectures, best practices and products for clients in diverse sectors, e.g. finance, manufacturing, media, retail and government. He helps large organisations navigate their Big Data journey from strategy to hands-on implementation of large-scale infrastructure projects. Previously, Christian worked as a Software Engineer, Data Engineer, Data Scientist, Architect and Principal Consultant. In his last role, he handled the design and implementation of a fully cloud-based infrastructure for an e-commerce analytics startup that processed hundreds of millions of products until it was acquired by Google. In his spare time, he writes and speaks about Big Data or travels the world.
10:35am - 11:25am

by Alexey Kharlamov
VP of Technology @IntegralAdScience

Modern data streaming systems process millions of messages per second. To extract value from their data, organizations employ horizontally scalable distributed event processors such as Apache Storm. Such architectures are frequently designed under the assumption that data loss and calculation errors are acceptable.

In other cases, Kappa architecture is used to fulfill performance requirements without sacrificing consistency and reliability. And given the typical data consistency and...

11:50am - 12:40pm

by Robert Metzger
PMC member and committer Apache Flink project

Data streaming is gaining popularity, as more and more organizations are realizing that the nature of their data production is continuous and unbounded, and can be better served with a streaming architecture. Streaming architectures promise decreased latency from signal to decision, a radically simplified data infrastructure architecture, and the ability to cope with new data that is generated continuously. Apache Flink is a full-featured true stream processing framework with:

  • ...
1:40pm - 2:30pm

by Ben Stopford
Core Kafka team @Confluent

The world of Microservices is a little different to standard service-oriented architectures. They play to an opinionated score of decentralisation, isolation and automation. Stream processing comes from a different angle, though: one where analytic function is melded to a firehose of in-flight events. Yet business applications increasingly need to be data intensive, nimble and fast to market. This isn’t as easy as it sounds.

This talk will look at the implications of mixing toolsets...

2:55pm - 3:45pm

by Manuel Fahndrich
SE @Google working on horizontal auto-scaling batch/streaming pipelines

Resource allocation and tuning of large data-parallel pipelines has traditionally been a manual process based on human oversight, and as such is costly, wasteful, and slow to respond. Pipelines might see spikes in input rates or organic traffic growth, or fall behind due to outages or throttling of other services. Typically, such variation forces operators either to overprovision their resources for the worst case or to manually monitor and adjust resources when necessary. Both of these...

4:10pm - 5:00pm

by Richard Kasperowski
Author of The Core Protocols: A Guide to Greatness

Open Space
5:25pm - 6:15pm

by Sudhir Tonse
Engineering Manager @Uber - Marketplace Data & Forecasting

Uber is making rapid strides as a thriving Marketplace/Logistics platform. The Marketplace system consists of a set of services that are responsible for handling rider requests and other fulfillment requests from UberEats, Rush etc. To make our Marketplace system efficient and intelligent, we need to extract deep and timely insights from our carefully curated data. We also have to make the insights easily accessible for both people (Operations, Data Scientists and Engineers) and Machines to...


Monday, 7 March

Tuesday, 8 March

Wednesday, 9 March