Presentation: Develop Hundreds of Kubernetes Services at Scale With Airbnb
This presentation is now available to view on InfoQ.com
What You’ll Learn
- Find out how Airbnb uses Kubernetes.
- Hear how Kubernetes can be used to deploy configurations at scale.
- Learn how to integrate Kubernetes into a CI/CD chain.
Abstract
You've already taken the plunge and moved to Kubernetes, and you feel pretty good about that. But why does it feel like engineers need expert-level Kubernetes knowledge to get anything done?
This talk will identify key problems that make out-of-the-box Kubernetes less friendly to developers, and strategies for addressing them, based on Airbnb’s experience empowering one thousand engineers to develop hundreds of Kubernetes services at scale. This talk will focus primarily on four problem areas:
- Configuration: abstracting away Kubernetes configuration, generated services
- Lifecycle: versioning and refactoring configuration
- Tooling: creating and distributing opinionated kubectl and plugins
- CI/CD: build and deploy process, and validating configuration
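As a rough illustration of the first problem area, abstracting away Kubernetes configuration, the sketch below expands a small service definition into a full Deployment manifest. It is not Airbnb's tooling: the input fields (`name`, `team`, `image`, `replicas`, `port`), the defaults, and the function name `render_deployment` are all assumptions made for this example. Only the output follows the standard `apps/v1` Deployment layout.

```python
import json


def render_deployment(service):
    """Expand a small, hypothetical service definition into a Deployment manifest.

    The input schema is an assumption for this sketch; the output uses the
    standard apps/v1 Deployment fields.
    """
    name = service["name"]
    labels = {"app": name, "team": service["team"]}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # Defaults live in one place, so engineers only state what differs.
            "replicas": service.get("replicas", 2),
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": service["image"],
                            "ports": [{"containerPort": service.get("port", 8080)}],
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical service; the registry host is a placeholder.
    example = {
        "name": "bookings",
        "team": "reservations",
        "image": "registry.example.com/bookings:1.4.2",
    }
    print(json.dumps(render_deployment(example), indent=2))
```

The point of a generator like this is that an engineer edits a few lines of service definition, while the boilerplate and the defaults are maintained centrally.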
Tell me a bit about your work at Airbnb.
Our biggest problem right now is trying to scale both our engineering productivity and the availability of our site. It involves breaking apart a giant application into services, and applying the best practices available. Previously we had all these bare metal instances on AWS configured with Chef. We're moving them to a more scalable SOA solution that's backed by Kubernetes, because there were some limits we hit with our old infrastructure around service discovery and things like that. I would consider us a fairly early adopter, and we did run into some issues. What I'm working on is solving these issues for our engineers, and more broadly for the community.
Tell me a bit about the goals for the talk.
The goals of the talk are to go over broad-level things that we think you should watch out for as well as very specific things. If there are specific things we got stuck on, I'll just throw them out there. There are lots of examples of that, like certain ways we annotate our pods. I'll try to get low level when I can, when I don't think it's worth speaking in vague terms, but also address big ideas like what happens when we store infrastructure one way or another. What are the problems when everything is configured through a console, or as code? When we configure as code we get a bunch of validation and continuous integration for free. It's also really powerful for us to be able to refactor all of our projects. One thing I want to dive into is our refactoring tool and how important that has been for all kinds of migrations. Many companies have a lot of overhead: security updates, patches, and software and hardware reaching end-of-life. All migrations, even big ones, can be automated in some way. We have a Kubernetes cron job script that can update every single service. We have over 500 services now. It gets to a point where you can't do this manually, or if you're doing it manually you're not doing it on a very good time frame.
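To make the "configuration as code" idea concrete, here is a minimal sketch of how one refactor could be applied to every service's configuration and then validated, the kind of check a CI job might run. It is not Airbnb's refactoring tool. The config fields (`name`, `owner`, `image`), the validation rules, and every function name below are assumptions made for illustration.

```python
import copy


def split_image(image):
    """Split 'repo:tag' into (repo, tag); tag is '' when no tag is given."""
    last_segment = image.rsplit("/", 1)[-1]  # ignore a ':' in a registry port
    if ":" not in last_segment:
        return image, ""
    repo, _, tag = image.rpartition(":")
    return repo, tag


def validate(config):
    """Return a list of problems; an empty list means the config passes."""
    errors = []
    for field in ("name", "owner", "image"):
        if not config.get(field):
            errors.append(f"missing required field: {field}")
    image = config.get("image", "")
    if image and split_image(image)[1] in ("", "latest"):
        errors.append("image must be pinned to an explicit tag")
    return errors


def retag_image(config, old_tag, new_tag):
    """Example refactor: move a service from one image tag to another."""
    updated = copy.deepcopy(config)  # never mutate the caller's config
    repo, tag = split_image(updated.get("image", ""))
    if tag == old_tag:
        updated["image"] = f"{repo}:{new_tag}"
    return updated


def refactor_all(configs, refactor):
    """Apply one refactor to every service and collect validation failures."""
    results, failures = [], {}
    for config in configs:
        updated = refactor(config)
        problems = validate(updated)
        if problems:
            failures[config.get("name", "<unnamed>")] = problems
        results.append(updated)
    return results, failures


if __name__ == "__main__":
    # Hypothetical service configs standing in for files in a repository.
    services = [
        {"name": "bookings", "owner": "reservations", "image": "repo/bookings:1.0"},
        {"name": "payouts", "owner": "", "image": "repo/payouts:latest"},
    ]
    updated, failed = refactor_all(services, lambda c: retag_image(c, "1.0", "1.1"))
    for service in updated:
        print(service["name"], service["image"])
    for name, problems in failed.items():
        print("FAILED", name, problems)
```

Because the refactor is a plain function over config data, the same loop works for 5 services or 500, and the validation step reports which services still need manual attention.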
Who's the audience that you're talking to?
I believe the talk is useful both to people using Kubernetes and to infrastructure engineers trying to empower other people. One of the other ideas I'd like to drive home is that it’s very important to automate your own workflow. I don't want this to be about automating things for people who don't understand Kubernetes or for people who don't want to be bothered. A lot of the automation that we do is stuff that I personally use. It's abstractions that are useful for everyone.
What do you want someone who comes to the talk to walk away with?
I want them to walk away with a deeper understanding of how Kubernetes works at scale, and some of the scaling problems you hit. But also not to be afraid because these problems are solvable and there are many solutions out there. Just a deeper understanding of how they might go about integrating Kubernetes with their infrastructure. It's not always straightforward. Be excited by Kubernetes, have a plan for integrating it and then know these gotchas ahead of time.
Can you give me an example of the scalability use cases you might talk about?
We have a bunch of different use cases. Every part of our website that you might visit is powered by a Kubernetes service. If you click to make a reservation, that is handled by a service, and the exchange of money is also done by a service. We also have a bunch of very data-intensive machine learning jobs, and those are also run in Kubernetes. So there are different use cases: payments, cron jobs that are working with banks, services that are making requests and have an API layer, and then these machine learning jobs.
One of the problems we had was that we wanted to apply our Kubernetes configuration and also all these other, earlier configurations, such as AWS configuration. How do you deploy your service alongside all your other configuration, including things like project ownership, metadata, and documentation? All of these things need to be updated and applied somehow. Another scaling issue we hit was how to optimize builds for a huge Java monorepo; there are some interesting multi-stage build optimizations you can make.