Presentation: "Resilient Response In Complex Systems"

Track: Keynote / Time: Friday 09:00 - 10:00 / Location: To be announced

Complex systems fail, and they don't always fail in expected ways. Recovering from, learning from, and anticipating failure in complex systems require the efficient cooperation of response teams in sometimes disorienting and escalating scenarios. There are a number of pitfalls that engineers can fall into while troubleshooting production systems under these conditions, but there are also ways to side-step them gracefully. This talk will cover both the pitfalls and the ways around them, and will compare and contrast web operations at scale with the practices and culture of High Reliability Organizations in fields such as aviation and nuclear power.

John Allspaw, Engineering Culture Hacker

John has worked in systems operations for over fourteen years in biotech, government and online media. He started out tuning parallel clusters running vehicle crash simulations for the U.S. government, and then moved on to the Internet in 1997. He built the backing infrastructures at Salon, InfoWorld, Friendster, and Flickr. He is now SVP of Tech Operations at Etsy, and is the author of The Art of Capacity Planning and Web Operations, both published by O'Reilly.