
Presentation: Complex Event Flows in Distributed Systems

Track: Operationalizing Microservices: Design, Deliver, Operate

Location: Fleming, 3rd flr.

Duration: 10:35am - 11:25am

Day of week: Tuesday

This presentation is now available to view on InfoQ.com

What You’ll Learn


  1. Hear about a distributed edge computing system that does not rely on any central system to function.

  2. Learn the principles and the benefits underlying such a system.


Event-driven architectures enable nicely decoupled microservices and are fundamental for decentralized data management. However, using peer-to-peer event chains to implement complex end-to-end logic that crosses service boundaries can accidentally increase coupling. Extracting such business logic into dedicated services reduces coupling and makes it possible to keep sight of larger-scale flows - without violating bounded contexts, harming service autonomy or introducing god services. Service boundaries get clearer and service APIs get smarter by focusing on their potentially long-running nature. I will demonstrate how the new generation of lightweight and highly scalable state machines eases the implementation of long-running services. Based on my real-life experiences, I will share how to handle complex logic and flows that require proper reactions to failures, timeouts and compensating actions, and I will provide guidance backed by code examples to illustrate alternative approaches.
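The long-running flow with compensating actions described in the abstract can be sketched roughly as follows. This is a minimal, hypothetical Python sketch - the `OrderFlow` class, step names and log entries are illustrative assumptions, not taken from any particular workflow engine:

```python
# Hedged sketch: a minimal long-running state machine with compensation.
# Class and step names are hypothetical, for illustration only.

class OrderFlow:
    """Tracks flow state and the compensations needed to undo progress."""

    def __init__(self):
        self.state = "NEW"
        self.compensations = []  # undo actions for completed steps

    def step(self, name, action, compensation=None):
        """Run one step; on failure, run compensations in reverse order."""
        try:
            action()
            if compensation is not None:
                self.compensations.append(compensation)
            self.state = name
        except Exception:
            self.state = "COMPENSATING"
            for undo in reversed(self.compensations):
                undo()
            self.state = "CANCELLED"
            raise

log = []

def reserve_goods():
    raise RuntimeError("out of stock")  # simulate a failing step

flow = OrderFlow()
flow.step("PAYMENT_TAKEN", lambda: log.append("charge card"),
          compensation=lambda: log.append("refund card"))
try:
    flow.step("GOODS_RESERVED", reserve_goods)
except RuntimeError:
    pass

print(flow.state)  # CANCELLED
print(log)         # ['charge card', 'refund card']
```

A production-grade engine would additionally persist the state between steps and schedule timers for the timeout reactions the abstract mentions; this sketch only shows the compensation idea.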


What does the title of your presentation mean?


The title is basically the easiest possible use case that I can use to illustrate why we're doing the complicated things that we are doing. It's because in a factory people still use paper, and in thousands of factories across Germany people on the shop floor do not use computers because computers are not reliable enough. Telling people what to do, sending them work orders is done on paper and recording the results is done with a pen on paper. The reason is that the paper works without Wi-Fi, without electricity, and it will remember what you wrote on it. The same is not true for normal apps that connect to a server.


You're basically talking about a queue at each station: you pass a message to it, it goes into the queue, they process it, and they move it to the next stage.


Something like that. Also, you want to record all sorts of things: when people started and stopped, whether they have checked certain checkmarks or put in measurement values that you record for quality assurance in a factory. Whatever processes you do when you interact as a human with all the machines that are there.


What's the focus of your talk?


The main point is that we cannot go to the cloud, because the cloud could be unavailable or unreachable. This is why all the processing of the business logic needs to run in the factory, on the shop floor. It can't be anywhere else; otherwise it wouldn't be as reliable as paper. This is why the business logic needs to be completely and fully distributed across the edge devices. It means that each edge device may have a front-end, but it definitely has a back-end, and there is no central back-end. That is basically the main point.

This yields a very distributed system with many components, and we want to make that manageable, which is why all the principles that we have for cutting a problem into microservices apply here as well. Only the microservices here are not operated on EC2 instances: a microservice may run across five tablets, and it has a clear, well-described responsibility. It generates an event stream. It's fully event sourced, and the events are what is shared across the devices. In the talk I will go into some depth on how it works, though probably not into the business secrets that we might still want to keep secret. But the point is that you can distribute it like this and it behaves exactly like paper. That's also where the title makes sense: the processes that you execute at each location can run based on the knowledge they have acquired. Even if they are cut off from the network, they can continue to work from that knowledge and record new facts, and the new facts are then synchronized between the devices.


Are you talking about a framework that pushes a mesh to an edge environment to a factory floor, are you talking about a programming paradigm?


We use certain technologies. For example, we use IPFS, the InterPlanetary File System, and its library to do the peer-to-peer communication between the devices; there's no need to implement that yourself. Then I talk about a programming paradigm that is independent of language: how to make this work. Because you cannot have a central message queue or a broker - you don't have anything that puts things into sequence - you need one event log per tablet, for example. Then you need to merge these logs and process them, and you need a suitable programming model, which is essentially a state monad on top of the event log: you replay it every time you get new information. And information might come in from the past, so it's only eventually consistent.
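The merge-and-replay idea described here can be illustrated with a small sketch. This is a hedged approximation in Python, not the actual implementation: the event shape and the ordering by (Lamport timestamp, device id) are illustrative assumptions.

```python
# Hypothetical sketch: one append-only event log per device, merged into a
# single deterministic order, then folded ("replayed") into state. A late
# event from the past simply triggers a fresh replay - hence the eventual
# consistency mentioned above.

def merge(*logs):
    """Deterministically order events from all devices."""
    return sorted((e for log in logs for e in log),
                  key=lambda e: (e["lamport"], e["device"]))

def replay(events, on_event, initial):
    """Fold the merged log into application state (the 'state monad' idea)."""
    state = initial
    for e in events:
        state = on_event(state, e)
    return state

# Example: counting finished work steps recorded on two tablets.
tablet_a = [{"lamport": 1, "device": "A", "type": "stepDone"},
            {"lamport": 3, "device": "A", "type": "stepDone"}]
tablet_b = [{"lamport": 2, "device": "B", "type": "stepDone"}]

count = lambda s, e: s + 1 if e["type"] == "stepDone" else s
state = replay(merge(tablet_a, tablet_b), count, 0)   # -> 3

# A delayed event from the past arrives; just replay again.
tablet_b.append({"lamport": 1, "device": "B", "type": "stepDone"})
state = replay(merge(tablet_a, tablet_b), count, 0)   # -> 4
```

Replaying from scratch on every new event is the simplest correct strategy; a real system would snapshot intermediate state and only replay from the point where the late event sorts in.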


Who is the main audience for this talk?


That is an extremely interesting question. I think everybody should be. I want the industry as a whole to start thinking about this, because what we have come to agree upon is that we buy certain services from the large cloud providers and that's how we structure software. That means we have all the computation of those algorithms and business logic in very few hands, and they run centralized - even though AWS, Google and Microsoft spread their computing centers across the globe, it's all centralized in the sense of the business entities. So if we want to make the Internet of Things really an Internet of Things, where things communicate with one another, there is no need for someone like Google to mediate that exchange. If we go to the edge and make direct communication happen, then we can solve problems that benefit many use cases. For example, my brother uses his computer for the Photoshopping of his photos. He has his NAS device and he has his TV thingy, and in order to synchronize the photos between the three he needs two different clouds, which is completely weird - it's all in the same room on the same Wi-Fi. I think we need to go in that direction.


What are the actionable takeaways?


That's a difficult part, because I would like people to use a framework that I would be able to offer them. But unfortunately the implementation that I have designed - and that's what we're using at Actyx - is not open source. So I cannot point people to an edge Kubernetes variant; that's unfortunately not how it works. It's more about the principle. We can make this work: we can have devices that exchange knowledge and that incorporate knowledge when they get it. We don't need any central database, and I can give people enough of the recipe that they can start building such a framework themselves if they want. Unfortunately, it's not yet ready-made.


What do you want someone to walk away from the talk with?


Inspired that truly decentralized computing is actually possible, that it's viable. Because if we can make it work in a factory - and it actually does work, and people pay money for it - it proves that there is something here.

Speaker: Bernd Ruecker

Co-founder and chief technologist @Camunda

I have been in the software development field for more than 15 years, automating highly scalable workflows at global companies including T-Mobile, Lufthansa and Zalando, and contributing to various open source workflow engines. I'm Co-Founder and Chief Technologist of Camunda - an open source software company reinventing workflow automation. Along with my Co-Founder, I wrote "Real-Life BPMN," a popular book about workflow modeling and automation. I regularly speak at international conferences and write for various magazines, focusing on new workflow automation paradigms that fit into modern architectures around distributed systems, microservices, domain-driven design, event-driven architecture and reactive systems.

