Designing for Failure

Past Presentations

Monkeys in Lab Coats: Applying failure testing research @Netflix

Industry and academia need each other. Far from the tire fires of production, university researchers have the time to ask big questions. Sometimes they get lucky and obtain answers that change how we think about large-scale systems! But detached from real-world constraints, systems research in...

Kolton Andrus Founder of Gremlin Inc, former Netflix
Peter Alvaro Computer Science Assistant Professor @UniversityofCalifornia
Building Reliability In An Unreliable World

GameSparks is a globally-distributed Backend-as-a-Service platform that serves tens of billions of API requests per month for hundreds of live games which have tens of millions of active users, hundreds of thousands of whom are concurrently connected at any one time. All of our players connect...

Greg Murphy Chief Architect, Infrastructure & Operations @GameSparks
Challenging Perceptions of NHS IT

What are your perceptions of NHS IT? Not great? Well the truth is very different to what you might expect. There is something of a technical renaissance going on in parts of the NHS where things are being done in a modern way, learning from past experiences. We'll look at one example system...

Edward Hiley Principal Engineer @NHSDigital
Daniel Rathbone Technical Director @InfinityWorks
Building and Trusting a Cloud Bank

"Fail fast and fail often" - not only does the Silicon Valley mantra speak volumes on the relentless pursuit of innovation, it also highlights technology's power of unpredictability. But when creating a bank from scratch, the art is in combining pace of change and stability. So when your boss...

Greg Hawkins Former Chief Technology Officer @starlingbank
Best Practices Building Resilient Systems

Architecting for Failure covers the challenges (both technical and organizational) of constantly improving service delivery of a growing global company with a 24x7x365 service redundancy requirement. The talk focuses on best practices and lessons learned in building resilient systems. Topics...

Pablo Jensen CTO @Sportradar
Scaling Uber's Elasticsearch Clusters

Uber's Marketplace is the algorithmic brain behind Uber's ride-sharing services, and the brain needs an immense amount of real-time data to make timely and sound decisions. Uber's Marketplace Intelligence team has been using Elasticsearch as a real-time OLAP database to serve thousands of internal...

Danny Yuan Real-time Streaming Lead @Uber


Danny Yuan Real-time Streaming Lead @Uber

Scaling Uber's Elasticsearch Clusters

How would you describe the persona and level of the target audience?

The target audience is software engineers or SREs who are interested in scaling out Elasticsearch for OLAP workloads. The audience should have a basic understanding of Elasticsearch and OLAP.

Jonas Bonér Founder & CTO @Lightbend / Creator of Akka

How Events Are Reshaping Modern Systems

How would you describe the persona and level of the target audience?

My talk is for programmers and architects (from beginners to experienced) who are interested in and intrigued by event-driven systems and event-driven architecture.

Michael Maibaum Chief Architect @SkyBet

Pragmatic Resiliency: Super 6 & Sky Bet Evolution

What is your talk about?

You go to a lot of conferences and you hear people from Google or Netflix talking about reactive architectures or the Simian army or whatever, and it all feels quite unattainable for a lot of people. It's like this big complicated thing, there is not much like those systems. And Sky Bet has changed quite a lot over the last few years....

Want to keep in touch with more QCon London 2021 announcements?