Understanding Progressive Collapse: How To Avoid A Cascading Failure

Abstract

Small things going wrong can quickly snowball. A cascading failure is a nightmare scenario for any system: an initial problem that seems minor in isolation can kick off a chain reaction of ever-larger failures, potentially leading to catastrophic results.

When the failure of a single component causes other connected elements to fail in turn, this is known as a progressive collapse. In this talk, Sam Newman looks at this phenomenon in more detail and examines how it has manifested in major disasters. Drawing on lessons learned from other industries, Sam will share three key techniques that can be used to guard against progressive collapse in your own system.

This talk will help you understand how to architect your systems in such a way that small failures stay small.


Speaker

Sam Newman

Microservice, Cloud, CI/CD Expert, Author of "Building Microservices" and "Monolith to Microservices", 20+ Years Experience as a Developer

Sam Newman is an independent consultant who loves solving problems with technology. Focusing primarily on cloud, microservice architecture and continuous delivery, Sam works with companies big and small all over the world. He is also an experienced conference speaker, and author of the O’Reilly books Monolith to Microservices, Building Microservices, and the forthcoming Building Resilient Distributed Systems.

From the same track

Session resilience

How to Find Resilience Bugs in Systems that Don't Exist

Wednesday Mar 18 / 10:35AM GMT

Building correct distributed systems takes thinking outside the box, and the fastest way to do that is to think inside a different box. One different box is "formal methods", the discipline of mathematically verifying software and systems.

Hillel Wayne

Author of "Logic for Programmers" and "Learn TLA+", Thought Leader in the Space of Empirical Software Engineering

Session

Spritely: Infrastructure for the Future of the Internet

Wednesday Mar 18 / 11:45AM GMT

Let's take back the internet! Learn about Spritely's work to re-decentralize the net with new foundational technologies that put users in control.

Christine Lemmer-Webber

Executive Director @Spritely Institute, Co-Author of ActivityPub

David Thompson

CTO @Spritely Institute

Session

Maintaining Data Integrity During Regional Outages

Wednesday Mar 18 / 02:45PM GMT

Details coming soon.

Session

Migrating Legacy Monoliths to Resilient Microservices Without Downtime

Wednesday Mar 18 / 03:55PM GMT

Details coming soon.