You are viewing content from a past/completed QCon

Presentation: Develop Hundreds of Kubernetes Services at Scale With Airbnb

Track: DevOps & DevEx: Remove Friction, Ship Code, Add Value

Location: Fleming, 3rd flr.

Duration: 4:10pm - 5:00pm

Day of week: Tuesday

What You’ll Learn

  1. Find out how Airbnb uses Kubernetes.
  2. Hear how Kubernetes can be used to deploy configurations at scale.
  3. Learn how to integrate Kubernetes into a CI/CD chain.


You've already taken the plunge and moved to Kubernetes, and you feel pretty good about that. But why does it feel like engineers need expert-level Kubernetes knowledge to get anything done?

This talk will identify key problems that make out-of-the-box Kubernetes less friendly to developers, and strategies for addressing them, based on Airbnb’s experience empowering one thousand engineers to develop hundreds of Kubernetes services at scale. This talk will focus primarily on four problem areas:

  1. Configuration: abstracting away Kubernetes configuration, generated services
  2. Lifecycle: versioning and refactoring configuration
  3. Tooling: creating and distributing opinionated kubectl and plugins
  4. CI/CD: build and deploy process, and validating configuration
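
As a rough illustration of the first and fourth problem areas, the following Python sketch expands a short service definition into a full Kubernetes Deployment manifest and runs a simple check of the kind that could sit in a CI step. It is not Airbnb's tooling: the `ServiceSpec` fields, the `render_deployment` and `validate` helpers, and the image-tag rule are all assumptions made up for this example.

```python
import json
from dataclasses import dataclass


@dataclass
class ServiceSpec:
    """Hypothetical short service definition that an engineer would edit."""
    name: str
    image: str
    replicas: int = 2
    port: int = 8080


def render_deployment(spec: ServiceSpec) -> dict:
    """Expand the short spec into a full Kubernetes Deployment manifest."""
    labels = {"app": spec.name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": spec.name, "labels": labels},
        "spec": {
            "replicas": spec.replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": spec.name,
                            "image": spec.image,
                            "ports": [{"containerPort": spec.port}],
                        }
                    ]
                },
            },
        },
    }


def validate(manifest: dict) -> list:
    """Return a list of problems; an empty list means the manifest passes.

    The rules here are illustrative assumptions, not an official policy.
    """
    problems = []
    if manifest["spec"]["replicas"] < 1:
        problems.append("replicas must be at least 1")
    for container in manifest["spec"]["template"]["spec"]["containers"]:
        image = container["image"]
        # Simplistic check: require an explicit tag other than "latest".
        if ":" not in image or image.endswith(":latest"):
            problems.append(f"image for {container['name']} needs a pinned tag")
    return problems


if __name__ == "__main__":
    spec = ServiceSpec(name="example-service", image="example/image:1.2.3")
    manifest = render_deployment(spec)
    print(json.dumps(manifest, indent=2))
    print(validate(manifest))
```

The point of the design is that engineers touch only the handful of fields in the short definition, while the generated manifest and its validation rules can be changed centrally.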

Tell me a bit about your work at Airbnb.


Our biggest problem right now is trying to scale both our engineering productivity as well as the availability of our site. It involves breaking apart a giant application into services, and applying the best practices available. Previously we had all these bare metal instances on AWS configured with Chef. We're moving them to a more scalable SOA solution that's backed by Kubernetes, and there were some limits we hit with our old infrastructure around service discovery and stuff like that. I would consider us a fairly early adopter, and we did run into some issues. What I'm working on is solving these issues for our engineers, and more broadly for the community.


Tell me a bit about the goals for the talk.


The goals of the talk are to go over broad-level things that we think you should watch out for, as well as very specific things. If there are specific things we got stuck on, I'll just throw them out there. There are lots of examples of that, like certain ways we annotate our pods. I'll try to get low level when I can, when I don't think it's worth speaking in vague terms, but also address big ideas like what happens when we store infrastructure this way or that way. What are the problems when everything is configured through a console, or as code? When we configure as code we get a bunch of validation and continuous integration for free. It's also really powerful for us to be able to refactor all of our projects. One thing I want to dive into is our refactoring tool and how important it has been for all kinds of migrations. Many companies have a lot of overhead: updating security patches, and software and hardware reaching end-of-life. All migrations, even big ones, can be automated in some way. We have a Kubernetes cron job script that can update every single service. We have over 500 services now. It gets to a point where you can't do this manually, or if you're doing it manually you're not doing it on a very good time frame.
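
The idea of automating a migration across every service can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the refactoring tool described in the interview: the `base_image` field, the `example-base` image names, and the `bump_base_image` and `refactor_all` helpers are hypothetical, and service configurations are modeled as plain dictionaries rather than files in a repository.

```python
from typing import Callable, Dict

Config = dict
Refactor = Callable[[Config], Config]


def bump_base_image(config: Config) -> Config:
    """Hypothetical migration: move a service off an end-of-life base image."""
    updated = dict(config)  # copy so the original config is left untouched
    if updated.get("base_image") == "example-base:1":
        updated["base_image"] = "example-base:2"
    return updated


def refactor_all(configs: Dict[str, Config], refactor: Refactor) -> Dict[str, Config]:
    """Apply one refactor to every service and return only the changed configs."""
    changed = {}
    for service, config in configs.items():
        new_config = refactor(config)
        if new_config != config:
            changed[service] = new_config
    return changed


if __name__ == "__main__":
    configs = {
        "payments": {"base_image": "example-base:1", "replicas": 3},
        "search": {"base_image": "example-base:2", "replicas": 5},
    }
    for service, config in refactor_all(configs, bump_base_image).items():
        print(service, config)
```

Because the refactor is idempotent and reports only the services it changed, it could in principle be run repeatedly on a schedule, with each changed configuration going through the usual review and CI checks.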


Who's the audience that you're talking to?


I believe the talk is useful both to people using Kubernetes and to infrastructure engineers trying to empower other people. One of the other ideas I'd like to drive home is that it’s very important to automate your own workflow. I don't want this to be about automating things for people who don't understand Kubernetes or for people who don't want to be bothered. A lot of the automation that we do is stuff that I personally use. These are abstractions that are useful for everyone.


What do you want someone who comes to the talk to walk away with?


I want them to walk away with a deeper understanding of how Kubernetes works at scale, and some of the scaling problems you hit. But also not to be afraid because these problems are solvable and there are many solutions out there. Just a deeper understanding of how they might go about integrating Kubernetes with their infrastructure. It's not always straightforward. Be excited by Kubernetes, have a plan for integrating it and then know these gotchas ahead of time.


Can you give me an example of the scalability use cases you might talk about?


We have a bunch of different use cases. Every part of our website that you might visit is powered by a Kubernetes service. If you click to make a reservation, that is handled by a service, and the exchange of money is also done by a service. We also have a bunch of very data-intensive machine learning jobs, and those are also done in Kubernetes. So there are different use cases: payments, cron jobs that work with banks, services that make requests and have an API layer, and then these machine learning jobs.

One of the problems we had was that we wanted to apply our Kubernetes configuration along with all these other, earlier kinds of configuration, such as AWS configuration. How do you deploy your service alongside all your other configuration, such as project ownership, metadata, and documentation? All of these things need to be updated and applied somehow. Another scaling issue we hit was how to optimize builds for a huge Java monorepo, and there are some interesting multi-stage build optimizations you can make.

Speaker: Melanie Cebula

Software Engineer @airbnb

Melanie Cebula is an infrastructure engineer at Airbnb, where she works on service orchestration. She loves building systems that make it easy for developers to create and operate their own services securely and reliably.
