Presentation: Scaling Facebook Live Videos to a Billion Users

Duration: 11:50am - 12:40pm

Key Takeaways

  • Understand how Facebook was able to go from prototype to a worldwide deployment of Live in four months.
  • Learn considerations for architecting large systems that have high degrees of unpredictability and elasticity needs.
  • Hear strategies to deal with bad connectivity and thundering herds that are common in systems like Facebook Live.

Abstract

Facebook Live is a live streaming platform available to all Facebook users (1 billion daily active people) from the Facebook apps as well as an API. It enables citizen journalism, makes it easy to share everyday moments with your friends, and allows celebrities to interact with their fans directly. Facebook Live has seen tremendous growth in usage since launching to the public.

Building a successful live streaming platform requires live streams to have low latency and high quality. Broadcasting and viewing live streams has to work on a wide variety of devices across varying network capabilities. Interactivity during live streams is a key part of the overall experience. In this talk, we will cover why Live matters, how Facebook Live was architected for these requirements, and how it is set up to be flexible in adding features like multi-person live streaming.

Interview

Question: 
QCon: What is your role at Facebook today, and what are the types of problems you’re focused on?
Answer: 

I’m currently working on the backend for Facebook Live, Facebook Video, and Facebook Messenger. Most recently, I’ve been spending a lot of time on scaling the Facebook Live stack. Facebook Live lets anybody broadcast across the world using just the camera in their pocket. What this means in real terms is that the backend has to be able to scale to support billions of users, with little predictability. The scale we’re dealing with just keeps increasing. So that’s really been my primary focus.

Question: 
QCon: Are there numbers you can share to give us an idea of the scale you’re talking about?
Answer: 

Yeah, so Facebook has 1.23 billion daily active people, so the viewing of live streams has to be able to scale to all of those users.

Question: 
QCon: What is the focus of your QCon London talk?
Answer: 

I will answer questions like:

  • What is Facebook Live?
  • How did it come about?
  • Why is it important?
  • How did we architect it?
  • What were some of the interesting challenges we had to solve?
  • What's new with Facebook Live?

I’ll cover where we started and how we got to where we are today. I’ll also focus on some of the interesting problems we had to solve along the way, and how we are planning for changes in the future. I will start at a high level and then drop into the guts of the infrastructure powering Facebook Live.

Question: 
QCon: From a high-level, can you describe a bit about the architecture?
Answer: 

We divide Facebook Live into three aggregate components. The first is ingestion: this is where the broadcast stream comes into Facebook. The second is the server-side processing (or encoding). The third is distribution: sending the stream out to billions of devices so people can start watching it.
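
As a rough mental model only (not Facebook's actual code), the three components map onto a pipeline along these lines; the function names, types, and bitrates below are illustrative assumptions:

```python
from dataclasses import dataclass


@dataclass
class EncodedSegment:
    """A few seconds of video encoded at one quality level."""
    stream_id: str
    bitrate_kbps: int
    data: bytes


def ingest(stream_id: str, raw_upload: bytes) -> bytes:
    """Ingestion: terminate the broadcaster's upload (e.g. a stream from the
    phone) at a server close to them and hand the raw video onward."""
    return raw_upload


def encode(stream_id: str, raw_video: bytes) -> list[EncodedSegment]:
    """Server-side processing: transcode the stream into several bitrates
    so each viewer can be served a quality their connection supports."""
    return [EncodedSegment(stream_id, kbps, raw_video) for kbps in (200, 700, 2500)]


def distribute(segments: list[EncodedSegment], edge_caches: list) -> None:
    """Distribution: publish the encoded segments to edge caches / a CDN,
    which fan the stream out to viewers' devices."""
    for cache in edge_caches:
        for segment in segments:
            cache.put(segment)  # hypothetical edge-cache API
```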

The problems in each of these are fairly different. For example, on the ingestion side a huge problem we have to address is the network connectivity of the client’s phone. If I’m trying to go live from a village in India, I likely don’t have much connectivity, so how do we adapt and send out the highest-quality stream that the connection can handle? That becomes an interesting problem. Another example is on the other side (the distribution side), when we send the stream out to the CDN. Here there is a very real possibility of thundering herds. It is hard to figure out which stream is going to go viral, so we can’t really prepare for it ahead of time; it’s a spontaneous medium. These are a couple of examples of problems I will discuss.
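
One standard technique for blunting a thundering herd at the edge is request coalescing (sometimes called request collapsing): when many viewers ask an edge cache for the same live segment at once, only one request goes back to the origin and everyone else waits for that result. The sketch below is a hypothetical illustration of the idea, not Facebook's edge code:

```python
import threading


class CoalescingCache:
    """Toy edge cache that coalesces concurrent requests for the same
    live-stream segment so that only one of them goes to the origin."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable: segment_id -> segment bytes
        self._cache = {}                  # segment_id -> cached segment bytes
        self._inflight = {}               # segment_id -> Event that followers wait on
        self._lock = threading.Lock()

    def get_segment(self, segment_id):
        leader = False
        with self._lock:
            if segment_id in self._cache:          # cache hit: serve from the edge
                return self._cache[segment_id]
            event = self._inflight.get(segment_id)
            if event is None:                      # first requester becomes the leader
                event = threading.Event()
                self._inflight[segment_id] = event
                leader = True

        if leader:
            data = self._fetch(segment_id)         # only the leader hits the origin
            with self._lock:
                self._cache[segment_id] = data
                del self._inflight[segment_id]
            event.set()                            # wake the waiting followers
            return data

        event.wait()                               # followers block until the leader is done
        with self._lock:
            return self._cache[segment_id]
```

The payoff is that a stream going viral multiplies load on the edge caches, which are built for fan-out, rather than on the origin servers encoding the stream.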

Question: 
QCon: What is the core persona for the talk you’re planning to give?
Answer: 

I think architects and tech leads will find the talk very interesting. I will also talk about how we went from a prototype to a full production-ready system launched to the entire world in four months, so leaders and program managers will also be interested in how we pulled that off. This portion will also give some insight into Facebook culture.

Speaker: Sachin Kulkarni

Engineering Director @Facebook focused on Live, Videos, and Messenger

Sachin is the Director of Engineering for the infrastructure teams that build Facebook Live, Facebook Videos, and Messenger. He ran the team that built the backend for Facebook Live from scratch in 3 months and scaled it to 1 billion daily active people. He oversaw the re-architecture of the backend of Facebook Videos to enable all aspects of uploads, broadcasting, and delivery of all video content served through the Facebook app, Instagram, and Messenger. Sachin's team built the infrastructure for Messenger, which now supports 1 billion monthly active people and upwards of 15 billion messages per day. This backend is geographically distributed across the world to enable very low-latency messaging. Before that, Sachin managed Wormhole, the Facebook equivalent of Apache Kafka: a real-time data feed of *all* updates on Facebook that powers all ETL for Facebook. Earlier, he was an engineer on Tao, a social graph database that stores and caches all Facebook user data and is responsible for the complete Facebook user experience. If you like something on Facebook, upload a photo, or comment, you have relied on Tao, which serves billions of queries per second.
