Workshop: [SOLD OUT] Real-Time Streaming Data Pipelines

Location:

Level: Beginner

When: 9:00am - 4:00pm

Prerequisites

  • Basic knowledge of Apache Kafka concepts is helpful but not required. Kafka Streams applications are written in Java; sample solutions will be provided for those unfamiliar with the language.
  • To fully participate, you should bring a laptop with VirtualBox 5.x installed.
  • Our VM requires 3GB of RAM, so your laptop should have at least 4GB of RAM installed and must be able to run a 64-bit virtual machine (on Windows machines, VT-x should be enabled in the BIOS).
  • Your laptop should have at least 15GB of free disk space. A link to download the VM will be provided two weeks prior to the event.

This workshop is sold out.

In this workshop, we will show how Kafka Connect and Kafka Streams can be used together to build a real-world, real-time data pipeline. Using Kafka Connect, we will ingest data from a relational database into Kafka topics as the data is being generated. We will then process and enrich the data in real time using Kafka Streams, before writing it out for further analysis.
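To give a feel for the "no code" ingest step, here is a minimal sketch of a Kafka Connect source configuration. It assumes Confluent's JDBC source connector and a MySQL database; the workshop description does not say which connector or database will be used, and the connector name, connection URL, table names, column name, and topic prefix are all hypothetical.

```properties
# Hypothetical connector name
name=workshop-jdbc-source
# Assumption: Confluent's JDBC source connector (not named in the workshop description)
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
# Hypothetical database location and credentials
connection.url=jdbc:mysql://localhost:3306/shop?user=connect&password=CHANGE_ME
# Pick up new rows as they are inserted, tracked by a strictly increasing column
mode=incrementing
incrementing.column.name=id
# Hypothetical tables to ingest; each one is written to the topic <topic.prefix><table>
table.whitelist=orders,customers
topic.prefix=db-
```

With a file like this, a standalone Connect worker would poll the listed tables and publish each new row to a Kafka topic, without any application code being written.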

We’ll see how easy it is to use Connect to ingest and export data (no code is required), and how the Kafka Streams domain-specific language (DSL) lets developers concentrate on business logic without worrying about the low-level plumbing of streaming data processing. Because Streams is a Java library, developers can build real-time applications without needing a separate cluster to run an external stream processing framework.
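As a rough illustration of what the DSL looks like, the sketch below filters a stream of orders and enriches each one by joining it against a table of customers. It is not taken from the workshop materials: the topic names, the string-encoded values, the application ID, and the broker address are hypothetical, and it assumes Kafka 1.0 or later and that both input topics are already keyed by customer ID.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class EnrichOrders {

    // Builds the processing topology. Topic names are hypothetical.
    static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        // Assumption: both topics are keyed by customer ID, with plain string values.
        KStream<String, String> orders =
            builder.stream("orders", Consumed.with(Serdes.String(), Serdes.String()));
        KTable<String, String> customers =
            builder.table("customers", Consumed.with(Serdes.String(), Serdes.String()));

        orders
            // Drop empty records before enrichment.
            .filter((customerId, order) -> order != null && !order.isEmpty())
            // Enrich each order with the latest customer record for the same key.
            .join(customers, (order, customer) -> order + " | " + customer)
            // Write the enriched stream out for further analysis.
            .to("orders-enriched", Produced.with(Serdes.String(), Serdes.String()));

        return builder.build();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "enrich-orders");      // hypothetical ID
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // hypothetical broker

        KafkaStreams streams = new KafkaStreams(buildTopology(), props);
        // Close the application cleanly when the JVM shuts down.
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        streams.start();
    }
}
```

The application runs as an ordinary Java process; starting more instances with the same application ID spreads the work across them, with no separate processing cluster involved.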

Key takeaways:

  • Configure Kafka Connect to move data between external systems and Apache Kafka
  • Write a real-time stream processing application using the Kafka Streams DSL
  • See how easy it is to scale Connect and Streams as your data volume increases

Speaker: Ian Wrigley

Director, Education Services @Confluent
