You are viewing content from a past/completed QCon -


Better Resilience Adoption through UX

Too often, attempts to bring resilience engineering to an organization fall flat. Perhaps there’s some initial interest, but that wavers under the crushing weight of JIRA queues and sprint reviews. The tools are there but no one’s using them.

This session will go over three case studies where teams achieved success (and a few that didn't!) by focusing on the human element of engineering tooling. In each one, we’ll look at a specific UX technique that team employed to put their company on a path to resilience.

What is the work you're doing today?

I'm working for a small startup. I do UI engineering, and people think that I write JavaScript and CSS. That's the least of what I do, partially because I work across the stack and partially because, in my opinion, I work with what matters: the people. Writing code is the easy part. Getting a button aligned is hard, but making sure that the button does what people expect and that the button is where people need it to be is so much more important. That's what I do. That's one of the things that's important to me.

What are the goals for your talk?

I want to talk about real-world case studies. I don't want to talk about abstract concepts. I'll talk about real-world case studies where a company's software got better, usually through resiliency. And I don't mean using Kubernetes, AI, and other buzzwords. They built a tool that wanted to be used, and engineers around the company said, "Oh, this is easy. I click the button and I get what I want. And now my software is better." One of the cases I want to talk about is just through monitoring and alerting: being able to see what's going on, and being alerted when things may start going haywire. It doesn't matter how powerful your metrics are if no one can access them.

What key takeaways would you like people to leave the talk with?

I think the most important thing for people to walk away with is that people are the most important part of software. If the software is unusable, no one enjoys it. I want people to build tools, in my case resilience tools, that are useful and make other people's days better, and to do that by talking to people.


Randall Koutnik

UI Engineer

Randall's career can be politely summed up as "interesting". He's worked at everything from tiny startups to Netflix to teaching introductory programming at a bootcamp. He wrote a book on RxJS, which didn't impress his cats much. You can find his words in written form at...



St James, 4th flr.


Chaos and Resilience: Architecting for Success


Interview Available, Resilient Systems, Monitoring, Monitoring Tools


Slides are not available


From the same track

SESSION + Live Q&A Interview Available

Preparing for the Unexpected

Convincing engineers to be on-call isn’t always straightforward. In 2019 the Customer Products group at the Financial Times set out to make their out of hours support process more sustainable after losing a number of people from their on-call team. In this talk you’ll discover how to...

Samuel Parkinson

Principal Engineer @FinancialTimes

SESSION + Live Q&A Incident Management

Growing Resilience: Serving Half a Billion Users Monthly at Condé Nast

Serving over half a billion monthly customers while keeping service availability high is a monumental task. Condé Nast operates in nearly 40 countries and is better known for its portfolio of household brands such as Vogue, Wired, Vanity Fair, and The New Yorker. Our globally distributed...

Crystal Hirschorn

VP Engineering, Global Strategy & Operations @CondeNast

SESSION + Live Q&A Incident Management

How Many Is Too Much? Exploring Costs of Coordination During Outages

Service outages can attract a lot of attention from a wide range of participants - particularly when the service is for a business critical function. These ‘stakeholders’ represent multiple roles with different experience, responsibilities, expertise and knowledge about how the system...

Laura Maguire

Cognitive Systems Engineer & Researcher

SESSION + Live Q&A Incident Management

Rethinking How the Industry Approaches Chaos Engineering:

In order to determine and envision how to achieve reliability and resilience that drive our businesses forward, organizations must be able to look back at past blunders unobscured by hindsight bias. Resilient organizations don’t take past successes as a reason for confidence. Instead, they...

Nora Jones

Senior Developer/ Engineer
