
Presentation: Develop Hundreds of Kubernetes Services at Scale With Airbnb

Track: DevOps & DevEx: Remove Friction, Ship Code, Add Value

Location: Fleming, 3rd flr.

Duration: 4:10pm - 5:00pm


What You’ll Learn

  1. Find out how Airbnb uses Kubernetes.
  2. Hear how Kubernetes can be used to deploy configurations at scale.
  3. Learn how to integrate Kubernetes into a CI/CD chain.


You've already taken the plunge and moved to Kubernetes, and you feel pretty good about that. But why does it feel like engineers need expert-level Kubernetes knowledge to get anything done?

This talk will identify key problems that make out-of-the-box Kubernetes less friendly to developers, and strategies for addressing them, based on Airbnb’s experience empowering one thousand engineers to develop hundreds of Kubernetes services at scale. This talk will focus primarily on four problem areas:

  1. Configuration: abstracting away Kubernetes configuration, generated services
  2. Lifecycle: versioning and refactoring configuration
  3. Tooling: creating and distributing opinionated kubectl and plugins
  4. CI/CD: build and deploy process, and validating configuration
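
As a rough illustration of the first problem area, the sketch below generates a Kubernetes Deployment manifest from a few service-level inputs, so that engineers do not write the full manifest by hand. It is a minimal sketch and not Airbnb's tooling: the function name render_deployment, its parameters, and the default port are assumptions made for this example. Only the manifest field names follow the standard apps/v1 Deployment shape.

```python
import json
from typing import Dict


def render_deployment(name: str, image: str, replicas: int = 1, port: int = 8080) -> Dict:
    """Build an apps/v1 Deployment manifest from a small service description.

    Hypothetical helper for illustration; the inputs and defaults are assumptions.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    # Example service name and image are made up for the demonstration.
    manifest = render_deployment("example-service", "example/image:1.0", replicas=3)
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be handed to kubectl apply, since Kubernetes accepts manifests in JSON as well as YAML.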

Tell me a bit about your work at Airbnb.


Our biggest problem right now is trying to scale both our engineering productivity and the availability of our site. It involves breaking apart a giant application into services, and applying the best practices available. Previously we had all these bare metal instances on AWS configured with Chef. We're moving them to a more scalable SOA solution that's backed by Kubernetes, because we hit some limits with our old infrastructure around service discovery and things like that. I would consider us a fairly early adopter, and we did run into some issues. What I'm working on is solving these issues for our engineers, and more broadly for the community.


Tell me a bit about the goals for the talk.


The goals of the talk are to go over broad things that we think you should watch out for, as well as very specific things. If there are specific things we got stuck on, I'll just throw them out there. There are lots of examples of that, like certain ways we annotate our pods. I'll try to get low level when I can, when I don't think it's worth speaking in vague terms, but also address big ideas, like what happens when we store infrastructure configuration one way or another. What are the problems when everything is configured through a console, versus as code? When we configure as code we get a bunch of validation and continuous integration for free. It's also really powerful for us to be able to refactor all of our projects. One thing I want to dive into is our refactoring tool and how important it has been for all kinds of migrations. Many companies have a lot of overhead: updating security, applying patches, software and hardware reaching end-of-life. All migrations, even big ones, can be automated in some way. We have a Kubernetes cron job script that can update every single service. We have over 500 services now. It gets to a point where you can't do this manually, or if you are doing it manually, you're not doing it on a very good time frame.
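
To make the idea of an automated, validated refactor concrete, here is a minimal sketch in Python. It is not the refactoring tool described in the interview: the config fields (name, owner, replicas, base_image), the jdk8 to jdk11 migration, and the function names are all hypothetical, chosen only to show one transform being applied to every service config and checked before it is accepted.

```python
from copy import deepcopy
from typing import Callable, Dict, List, Tuple

Config = Dict[str, object]


def validate(config: Config) -> List[str]:
    """Return a list of problems; an empty list means the config passes."""
    problems = []
    if not config.get("name"):
        problems.append("missing name")
    if not config.get("owner"):
        problems.append("missing owner")
    replicas = config.get("replicas")
    if not isinstance(replicas, int) or replicas < 1:
        problems.append("replicas must be a positive integer")
    return problems


def bump_base_image(config: Config) -> Config:
    """Hypothetical migration: move services from one base image to another."""
    updated = deepcopy(config)
    if updated.get("base_image") == "jdk8":
        updated["base_image"] = "jdk11"
    return updated


def refactor_all(
    configs: Dict[str, Config], transform: Callable[[Config], Config]
) -> Tuple[Dict[str, Config], List[str]]:
    """Apply one transform to every service config and validate each result.

    Returns the new configs and the sorted names of services that changed.
    Raises ValueError if any transformed config fails validation.
    """
    results: Dict[str, Config] = {}
    changed: List[str] = []
    for name, config in configs.items():
        new_config = transform(config)
        problems = validate(new_config)
        if problems:
            raise ValueError(f"{name}: {', '.join(problems)}")
        results[name] = new_config
        if new_config != config:
            changed.append(name)
    return results, sorted(changed)
```

Run on a schedule against a repository of service configs, a loop like this is one way a single migration could reach hundreds of services, with validation acting as the gate before any change is committed.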


Who's the audience that you're talking to?


I believe the talk is useful both to people using Kubernetes and to infrastructure engineers trying to empower other people. One of the other ideas I'd like to drive home is that it’s very important to automate your own workflow. I don't want this to be about automating things for people who don't understand Kubernetes or for people who don't want to be bothered. A lot of the automation that we do is stuff that I personally use. These are abstractions that are useful for everyone.


What do you want someone who comes to the talk to walk away with?


I want them to walk away with a deeper understanding of how Kubernetes works at scale, and of some of the scaling problems you hit. But I also want them not to be afraid, because these problems are solvable and there are many solutions out there. I want them to have a deeper understanding of how they might go about integrating Kubernetes with their infrastructure. It's not always straightforward. Be excited by Kubernetes, have a plan for integrating it, and know these gotchas ahead of time.


Can you give me an example of the scalability use cases you might talk about?


We have a bunch of different use cases. Every part of our website that you might visit is powered by a Kubernetes service. When you click to make a reservation, that is handled by a service, and the exchange of money is also done by a service. We also have a bunch of very data-intensive machine learning jobs, and those are also run in Kubernetes. So there are different use cases: payments, cron jobs that work with banks, services that make requests and have an API layer, and then these machine learning jobs.

One of the problems we had was that we wanted to apply our Kubernetes configuration together with all our other, earlier configuration, such as AWS configuration. How do you deploy your service alongside all your other configuration, including things like project ownership, metadata, and documentation? All of these things need to be updated and applied somehow. Another scaling issue we hit was how to optimize builds for a huge Java monorepo; there are some interesting multi-stage build optimizations you can make.

Speaker: Melanie Cebula

Software Engineer @airbnb

Melanie Cebula is an infrastructure engineer at Airbnb, where she works on service orchestration. She loves building systems that make it easy for developers to create and operate their own services securely and reliably.


