Presentation: What if You Need Reliability Comparable to Paper?

Track: Operationalizing Microservices: Design, Deliver, Operate

Location: Fleming, 3rd flr.

Duration: 1:40pm - 2:30pm

Day of week: Tuesday

Share this on:

What You’ll Learn

  1. Hear about a distributed edge computing system that does not rely on any central system to function.

  2. Find out what are the principles and the benefits underlying such a system.

Abstract

In the manufacturing industry downtime is very expensive, therefore most small and midsize factories are still managed using paper-based processes. The problem space is perfectly suited for the microservices approach: well-defined and locally encapsulated responsibilities, collaboration and loose coupling between different links in the chain, rapid evolution of individual pieces for the purpose of optimising business outcomes. But how can we operate microservices such that they can deliver the resilience of paper? How can we leverage the locality of process data with high bandwidth and low latency communication to enable innovative improvements in the Internet of Things? 

This talk explores the radical approach of operating microservices in a peer-to-peer network on the factory shop-floor, using event sourcing as the only means of communication and observation. We discuss the consequences of going all in on availability and partition tolerance. We pay particular attention to eventual consistency and its impact on replacing nodes, upgrading services, and evolving event schemas. And we see how event sourcing can help understand the behaviour of such an uncompromisingly distributed system and enable powerful testing—both before and after an hitting an issue in production.

Question: 

What does the title of your presentation mean?

Answer: 

The title is basically the easiest possible use case that I can use to illustrate why we're doing the complicated things that we are doing. It's because in a factory people still use paper, and in thousands of factories across Germany people on the shop floor do not use computers because computers are not reliable enough. Telling people what to do, sending them work orders is done on paper and recording the results is done with a pen on paper. The reason is that the paper works without Wi-Fi, without electricity, and it will remember what you wrote on it. The same is not true for normal apps that connect to a server.

Question: 

You're basically talking about a queue at each place, and you pass a message to it. It loads it in the queue. They process it and they move it to the next stage.

Answer: 

Something like that. Also, you want to record all sorts of things, when people started and stopped, whether they have checked certain checkmarks or put in measurement values that you do for for quality assurance in a factory. Whatever processes you do when you interact as a human with all the machines that are there.

Question: 

What's the focus of your talk?

Answer: 

The main point is that we cannot go to the cloud because the cloud could be unavailable, unreachable. This is why all the processing of the business logic needs to run in the factory on the shop floor. It can't be anywhere else. Otherwise it wouldn't be as reliable as paper. This is why the business logic needs to be completely and fully distributed across the edge devices. It means that each edge device may have a front-end but it sure does have the back-end, and there is no central back-end. That is basically the main point. This yields a very distributed system with many components, and we want to make that manageable, which is why all the principles that we have for cutting a problem into microservices apply here as well. Only the microservices here are not operated on EC2 instances. A microservice may run across five tablets, and it has a clear well described responsibility. It generates an event stream. It's fully event sourced, and the events are what is shared across the devices. There I will go into some depth of how it works but probably it's not possible to tell all the business secrets that we might still want to keep secret. But the point is that you can distribute it like this and it behaves exactly like paper. That's also where the title makes sense, and the processes that you execute at each location can run based on the knowledge that they have acquired even if they are now cut off from network , they can continue to work from that knowledge and record new facts and then new facts are synchronized between the devices.

Question: 

Are you talking about a framework that pushes a mesh to an edge environment to a factory floor, are you talking about a programming paradigm?

Answer: 

We use certain technologies. For example, we use IPFS, the interplanetary file system and its library to do the peer to peer communication between the devices. There's no need to implement that yourself. Then I talk about a programming paradigm that is independent of language. How to make this work. Because you cannot have a central message queue. You can't have a broker. You don't have anything that puts things into sequence, so you need to have one event log per tablet for example, and then you need to merge these logs and process them, and you need a suitable programming model which is essentially a state monad on top of the event log and you replay it every time you get new information. And information might come in from the past so it's only eventually consistent.

Question: 

Who's the main focus of this talk?

Answer: 

That is an extremely interesting question. I think everybody should be. I want the industry as a whole to start thinking about this because what we have come to agree upon is we buy certain services from the large cloud providers and that's how we structure software which means we have all the computation of those algorithms, business logic in very few hands, and they run centralized, even though AWS, Google and Microsoft spread their computing centers across the globe. It's all centralized in the sense of the business entities. So if we want to make the Internet of Things really an Internet of Things where things communicate with one another there is no need for someone like Google to mediate that exchange. If we go to the edge and we make direct communication happen then we can solve problems that benefit many use-cases. For example, my brother had his computer doing the Photoshopping of his photos. He has his NAS device and he has his TV thingy, and in order to synchronize the photos between the three he needs two different clouds which is completely weird. It's in the same room with the same Wi-Fi. I think we need to go in that direction.

Question: 

What are the actionable takeaways?

Answer: 

That's a difficult part because I would like people to use a framework that I would be able to offer them. But unfortunately the implementation that I have designed—and that's what we're using at Actyx—is not open source. So I cannot point people to an edge Kubernetes variant, that's unfortunately not how it works. It's more talking about the principle. We can make this work, we can have devices that exchange knowledge and that incorporate knowledge when they get it. We don't need any central database, and I can give people enough of the recipe so they can start building such a framework themselves if they want. Unfortunately, it's not yet ready made.

Question: 

What do you want someone to walk away from the talk with?

Answer: 

Inspired that this really decentralized computing is actually possible, that it's viable. Because if we can make it work in a factory and it actually does work and people pay money for it, it proves that there is something here.

Speaker: Roland Kuhn

CTO and co-founder of @actyx, author of Reactive Design Patterns & co-author of the Reactive Manifesto

I am CTO and co-founder of Actyx, author of Reactive Design Patterns, a co-author of the Reactive Manifesto, co-teacher of the Coursera course “Principles of Reactive Programming”, and a passionate open-source hakker. Previously I led the Akka project at Lightbend. I also hold a Dr. rer. nat. in particle physics from TU München and have worked in the space industry for several years. I spend most of my life in the central European timezone.

Find Roland Kuhn at

Similar Talks

Tracks