Presentation: Staying in Sync: From Transactions to Streams

Duration: 4:10pm - 5:00pm

Key Takeaways

  • Learn approaches to keeping datastores in sync in the face of failure and latency.
  • Explore event streams and Kafka use cases.
  • Understand consistency guarantees and trade-offs when keeping multiple datastores in sync.

Abstract

For the very simplest applications, a single database is sufficient, and then life is pretty good. But as your application needs to do more, you often find that no single technology can do everything you need to do with your data. And so you end up having to combine several databases, caches, search indexes, message queues, analytics tools, machine learning systems, and so on, into a heterogeneous infrastructure...

Now you have a new problem: your data is stored in several different places, and if it changes in one place, you have to keep it in sync in the other places, too. It's not too bad if all your systems are up and running smoothly, but what if some parts of your systems have failed, some are running slow, and some are running buggy code that was deployed by accident?

It's not easy to keep data in sync across different systems in the face of failure. Distributed transactions and 2-phase commit have long been seen as the "correct" solution, but they are slow and have operational problems, and so many systems can't afford to use them.

In this talk we'll explore using event streams and Kafka for keeping data in sync across heterogeneous systems, and compare this approach to distributed transactions: what consistency guarantees can it offer, and how does it fare in the face of failure?

Interview

Question: 
What is your main role today?
Answer: 
At the moment, I spend half of my time writing "Designing Data-Intensive Applications", which has been quite a long project, but I'm gradually getting towards the end of it. The other half of my time I spend at the University of Cambridge, working on a research project that is figuring out how to join up databases with security research.
What we are trying to do is the following: imagine you wanted to build Google Docs, but in a way that you don't have to trust Google's servers. We can do this using end-to-end encryption between the devices that are collaborating on a document, while still allowing the same kind of real-time collaboration that you get with Google Docs. We are trying to figure out how to build the fundamental infrastructure that would make it easy for people to write this kind of real-time collaborative app with end-to-end security.
Question: 
What is the main motivation for your talk?
Answer: 
The problem I want to address is the issue of keeping several datastores in sync with each other. The traditional way of handling that, if you want any sort of consistency guarantees, has been to use two-phase commit: distributed transactions across different stores. However, such transactions have all sorts of operational problems. The alternative has been to use eventual consistency everywhere, which performs better but has failure modes that are very hard to reason about. It's easy to end up in a situation where your data is wrong and you don't even realise it.
What I see in event-driven systems is a way to get fairly strong guarantees by making sure that the writes to different data stores always happen in the same order. It’s fundamentally a very simple idea, although actually putting it into practice is still a fair bit of work.
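To make that concrete, here is a minimal sketch, not taken from the talk itself, of what publishing such an ordered stream of writes might look like with Kafka's Java producer API. The topic name user-updates, the key user-42 and the JSON payload are invented for illustration; the point is that keying each change event by record ID routes all updates for the same record to the same partition, so every downstream consumer sees them in the same order.

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import java.util.Properties;

    public class ChangePublisher {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Rather than writing to the database, the cache and the search index
                // directly (dual writes), the application publishes a single change
                // event to the log. The key (the record ID) determines the partition,
                // so all updates to the same record keep their relative order.
                producer.send(new ProducerRecord<>(
                        "user-updates",   // hypothetical topic name
                        "user-42",        // key: record ID, fixes the partition
                        "{\"name\": \"Alice\", \"email\": \"alice@example.org\"}"));
            }
        }
    }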
Question: 
How deep are you diving into it, and how are you shaping that discussion?
Answer: 
I will introduce the idea using Kafka as the implementation. This is not intended as a pitch for Kafka specifically, but it happens to be one of the best-suited tools for the purpose. I will briefly summarize Kafka's architecture and how it works internally, for those who have not seen it before, because it is very different from traditional message queues like RabbitMQ or ActiveMQ.
In Kafka, once you've published your messages to a topic and several consumers are subscribed to it, all of them will see the messages within a particular partition in the same order. That is a guarantee that Kafka provides, and this ordering guarantee has a whole range of consequences. Failure handling becomes a lot simpler, because each consumer can just keep a checkpoint of its offset, indicating which messages it has seen and which it hasn't. Since the ordering stays the same, you can replay the history when recovering from a failure.
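A minimal sketch of that checkpointing pattern, assuming Kafka's Java consumer API and an invented topic, group and updateSearchIndex helper, might look like the following: automatic offset commits are disabled, the consumer applies each message to its store and only then commits its offset, so after a crash it resumes from the last checkpoint and replays the remaining messages in the original order.

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;

    public class IndexUpdater {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "search-indexer");   // one consumer group per downstream store
            props.put("enable.auto.commit", "false");  // we checkpoint offsets manually
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("user-updates"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        // Apply the change to the search index (hypothetical helper).
                        // Writes should be idempotent, because after a crash the
                        // messages since the last committed offset are replayed.
                        updateSearchIndex(record.key(), record.value());
                    }
                    // Checkpoint: record how far in each partition we have processed.
                    consumer.commitSync();
                }
            }
        }

        static void updateSearchIndex(String key, String value) {
            /* hypothetical: upsert the document into the search index */
        }
    }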
And there are performance benefits, because there is much less work for the message brokers to do. They don't need to keep track of the state of every single consumer and which messages it has acknowledged and which it hasn't, so you actually get much better throughput on the brokers as well. But most importantly, if you have this ordering guarantee, then you can write the messages to different datastores and be sure that once all the messages have been processed, the end results in all of these different stores will be consistent with each other.
You could see this as a scalable implementation of event sourcing. That is one way of phrasing it for people who are already familiar with event sourcing.
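Seen through that event-sourcing lens, a derived store can also be rebuilt from scratch by replaying the topic from the beginning. The sketch below is again hypothetical, using an in-memory map as a stand-in for a real cache: a fresh consumer group starts at the earliest offset, and any other store that consumes the same partitions in the same order converges to an equivalent view.

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import java.time.Duration;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;

    public class CacheRebuilder {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "cache-rebuild-" + System.currentTimeMillis()); // fresh group
            props.put("auto.offset.reset", "earliest"); // start from the beginning of the log
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            Map<String, String> cache = new HashMap<>(); // stand-in for a real cache

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("user-updates"));
                // Replaying the whole topic in order rebuilds the derived state from scratch.
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    for (ConsumerRecord<String, String> record : records) {
                        cache.put(record.key(), record.value()); // latest event per key wins
                    }
                }
            }
        }
    }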
