Architecting for Resilience

As recent service disruptions and cloud outages have reminded us all, resilience forms a critical aspect of software systems.

But how should we approach building it into our architectures?

From leveraging progressive collapse to avoid cascading failures to finding resilience bugs in systems that don't exist, this track will give attendees concrete steps for building more resilient systems in an increasingly distributed world.


From this track

Session resilience

How to Find Resilience Bugs in Systems that Don't Exist

Wednesday Mar 18 / 10:35AM GMT

Building correct distributed systems takes thinking outside the box, and the fastest way to do that is to think inside a different box. One different box is "formal methods", the discipline of mathematically verifying software and systems.

Hillel Wayne

Author of "Logic for Programmers" and "Learn TLA+"

Session decentralized

Spritely: Infrastructure for the Future of the Internet

Wednesday Mar 18 / 11:45AM GMT

Let's take back the internet! Learn about Spritely's work to re-decentralize the net with new foundational technologies that put users in control.

Christine Lemmer-Webber

Executive Director @Spritely Institute, Co-Author of ActivityPub

David Thompson

CTO @Spritely Institute

Session architecture

Understanding Progressive Collapse: How To Avoid A Cascading Failure

Wednesday Mar 18 / 01:35PM GMT

Small things going wrong can quickly snowball. The cascading failure is a nightmare scenario for any system: an initial problem that seems minor in isolation can kick off a chain reaction of ever-increasing failures, potentially leading to catastrophic results.

Sam Newman

Microservice, Cloud, CI/CD Expert, Author of "Building Microservices" and "Monolith to Microservices", 20+ Years Experience as a Developer

Session

Keeping the Nation On-Air: How We Think About Resilience at the BBC

Wednesday Mar 18 / 02:45PM GMT

At the heart of the BBC is delivering value to all, serving audiences across the UK and the world on TV, radio, and online with trusted and impartial news and high-quality British content.

Tom Everest

Head of Department for Architecture and Supply Chain @BBC

Session

Shielding the Core: Architecting Resilience with Multi-Layer Defenses

Wednesday Mar 18 / 03:55PM GMT

High-demand events can cause sudden traffic spikes that overwhelm even well-designed systems. In ticketing platforms, millions of users — alongside increasingly sophisticated automated agents — may arrive simultaneously, placing extreme pressure on backend services.

Anderson Parra

Staff Software Engineer @SeatGeek

Track Host

Jonathan Magen

Computer Scientist, Distributed Systems Specialist, 20+ Years in Software Development

Jonathan is a computer scientist who has been enthusiastically practicing for over two decades. His primary areas of inquiry include distributed systems, security and compliance automation, and making it easier to build big systems well. Jonathan lives and works in Philadelphia.
