Abstract

Small things going wrong can quickly snowball. The cascading failure is often a nightmare scenario for any system. An initial problem, which in isolation seems minor, can kick off a chain reaction of ever-increasing failures, potentially leading to catastrophic results.

When a failure of a single component results in the failure of other connected elements, this is known as a progressive collapse. In this talk, Sam Newman looks at this phenomenon in more detail and examines how it has manifested in major disasters. Based on lessons learned from other industries, Sam will share three key techniques that can be used to mitigate progressive collapse in your own system.

This talk will help you understand how to architect your systems in such a way that small failures stay small.

Interview:

What is your session about, and why is it important for senior software developers?

My session explores what happens when a small initial problem causes a giant catastrophe. In the context of buildings, this is called progressive collapse. In my talk, I look at what happens when a building suffers a progressive collapse, how these collapses can be mitigated, and what parallels we can draw to deal with the cascading failures we see in distributed systems.

Why is it critical for software leaders to focus on this topic right now, as we head into 2026?

My session is about how disparate parts of a system interact, especially in the context of increasingly distributed systems. How we write code may have changed a lot over the last couple of years, but the fundamentals of system design and the challenges of distributed systems remain.

What are the common challenges developers and architects face in this area?

When something goes wrong, they tend to look for one obvious cause, blame that, and move on, without looking at wider systemic issues.
There is too much focus on stopping things breaking, and not enough time spent on understanding how the system can continue to work when something does break.

What's one thing you hope attendees will implement immediately after your talk?

Stop looking for single causes of failure!

What makes QCon stand out as a conference for senior software professionals?

The curated tracks are what help QCon stand apart. They mean you get a lot less clash between tracks, and also that each individual track ends up having something for everyone.

Speaker

Sam Newman

Microservice, Cloud, CI/CD Expert, Author of "Building Microservices" and "Monolith to Microservices", 20+ Years Experience as a Developer

Sam Newman is an independent consultant who loves solving problems with technology. Focusing primarily on cloud, microservice architecture, and continuous delivery, Sam works with companies big and small all over the world. He is also an experienced conference speaker, and author of the O’Reilly books Monolith to Microservices, Building Microservices, and the forthcoming Building Resilient Distributed Systems.

Understanding Progressive Collapse: How To Avoid A Cascading Failure
