You are viewing content from a past/completed QCon

Presentation: Streaming a Million likes/second: Real-time Interactions on Live Video

Track: Streaming Data Architectures

Location: Churchill, G flr.

Duration: 11:50am - 12:40pm

This presentation is now available to view on InfoQ.com

What You’ll Learn

  1. Find out about LinkedIn’s Real-time Distribution Platform, what it does and how it does it.
  2. Learn how to scale up to millions of users globally.

Abstract

When a broadcaster like the BBC streams a live video on LinkedIn, tens of thousands of viewers watch it concurrently. Typically, hundreds of likes on the video will be streamed in real time to all of these viewers. That amounts to a million likes/second streamed to viewers per live video. How do we make this massive real-time interaction possible across the globe? In this talk, I’ll do a technical deep-dive into how we use the Play/Akka Framework and a scalable distributed system to enable live interactions like likes and comments at massive scale and extremely low cost across multiple data centers.
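As a rough back-of-the-envelope check on the million likes/second figure, the delivery rate is the number of concurrent viewers multiplied by the rate at which likes arrive. The sketch below assumes the "hundreds of likes" are per second, and uses illustrative values (10,000 viewers, 100 likes per second) chosen only to match "tens of thousands" and "hundreds"; they are not measurements from LinkedIn, and `fanout_rate` is a hypothetical helper name.

```python
def fanout_rate(concurrent_viewers: int, likes_per_second: int) -> int:
    """Deliveries per second when every like is pushed to every viewer."""
    return concurrent_viewers * likes_per_second


# Illustrative assumptions, not LinkedIn's measured numbers.
viewers = 10_000
likes_per_second = 100

deliveries = fanout_rate(viewers, likes_per_second)
print(f"{deliveries:,} like deliveries per second")
```

The point of the arithmetic is that the cost is in fan-out, not in ingest: the number of likes coming in is modest, but each one is multiplied by the audience size.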

Topics I will cover include:

  • Server-side and client-side frameworks for persistent connections.
  • Managing persistent connections with millions of active clients.
  • Pub/Sub architecture for real-time streaming with less than 100ms end to end latency to millions of connected clients. Hint: No Kafka!
  • Leveraging the same platform for other dynamic experiences like Presence.

Question:

What is the work you're doing today?

Answer: 

I'm the Tech Lead for LinkedIn Messaging and LinkedIn’s Real-time Distribution Platform. This is a platform that we use to deploy server-to-client streaming technology to power many dynamic experiences on LinkedIn. This includes instant distribution of likes, comments and concurrent viewer counts on live videos, instant messaging, typing indicators, seen receipts, and even online presence, those green online indicators that you see when you message someone.

Question: 

What are the goals you have for the talk?

Answer: 

The talk is centered on the platform I just described, which supports real-time distribution of likes, comments, concurrent viewer counts, and notifications to the millions of connected viewers watching live videos on LinkedIn at any given time. I have three goals. First, I want to get the audience really excited about the importance of dynamic, interactive experiences in their apps. These days, Instagram, Twitch, Facebook, and LinkedIn Learning are all moving toward getting people to interact with each other and learn from each other. That matters especially at LinkedIn: in the professional context, we want people to learn from each other, build those networks, and connect with each other. I want the audience to be excited about how they can apply such technology to their own apps.

Secondly, I want to do that by introducing the fundamental building blocks you need to build such a system. I believe those building blocks are a persistent connection with your clients, which allows you to actually stream data to them; a way for clients to subscribe to the topics they're interested in; and a way for publishers to publish to those topics, so that you can stream relevant data down to the clients at scale.

Thirdly, I want to discuss challenges in distributed systems and how to solve them by starting small and then adding layers of simple architecture to reach massive scale. There's no magic there. I will do that by sharing the real, practical experiences we had in doing so.
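The three building blocks named in this answer (a persistent connection, a way to subscribe to topics, and a way to publish to them) can be sketched in a few lines. This is a minimal in-memory illustration, not LinkedIn's implementation: `Connection`, `Dispatcher`, and the topic name `live-video-42` are hypothetical, and `send` simply records events where a real server would write to an open stream.

```python
from collections import defaultdict
from typing import Dict, List, Set


class Connection:
    """Stands in for one persistent connection to a client (hypothetical)."""

    def __init__(self, client_id: str) -> None:
        self.client_id = client_id
        self.received: List[str] = []

    def send(self, event: str) -> None:
        # A real server would write the event to an open stream here.
        self.received.append(event)


class Dispatcher:
    """Tracks topic subscriptions and fans published events out to them."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, Set[Connection]] = defaultdict(set)

    def subscribe(self, topic: str, conn: Connection) -> None:
        self._subscribers[topic].add(conn)

    def unsubscribe(self, topic: str, conn: Connection) -> None:
        self._subscribers[topic].discard(conn)

    def publish(self, topic: str, event: str) -> int:
        """Send the event to every subscriber; return how many received it."""
        subscribers = self._subscribers.get(topic, set())
        for conn in subscribers:
            conn.send(event)
        return len(subscribers)


dispatcher = Dispatcher()
alice, bob = Connection("alice"), Connection("bob")
dispatcher.subscribe("live-video-42", alice)
dispatcher.subscribe("live-video-42", bob)

delivered = dispatcher.publish("live-video-42", "like")
print(delivered, alice.received, bob.received)

dispatcher.unsubscribe("live-video-42", bob)
delivered_after = dispatcher.publish("live-video-42", "comment")
```

Everything here lives in one process. The scaling work described in the talk is about what happens when the subscription table and the connections no longer fit on a single machine.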

Question: 

In the abstract, you mention you're using Akka and Play. Why were those particular frameworks used?

Answer: 

The biggest reason is scale. Both Play and Akka are completely asynchronous, event-driven frameworks, which means that there are no blocking operations or shared state anywhere. That's the fundamental reason we chose them. Play, specifically, is a completely asynchronous web server framework that allows you to use a very small number of threads to serve a large number of requests, because a thread is used only while there is actual work to do for a given request. Akka enables a concurrent, message-driven system built on the concept of actors. Actors have state that is modified only via messages, so each actor can do its own small task without having to worry about what is happening in the rest of the system. A thread is used only while an actor is processing messages, and it is reassigned to the next actor that needs it once that actor goes idle. So, a small number of threads can serve a large number of actors. In my case, I use actors for maintaining these connections. Each connection is maintained by one actor, and you're able to scale very effectively because those actors work independently and serve their connections only when activity happens.
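To make the actor idea concrete, here is a minimal sketch in plain Python. It does not use Akka or imitate its API: `ConnectionActor` and `Scheduler` are hypothetical names, and a single worker loop stands in for the small thread pool that a real actor runtime would use. What it does show is the property described in this answer: each actor's state changes only inside its own message handler, and idle actors consume no worker time.

```python
from collections import deque


class ConnectionActor:
    """Owns the state for one connection; state changes only via messages."""

    def __init__(self, name, scheduler):
        self.name = name
        self.mailbox = deque()
        self.delivered = 0  # private state, touched only in receive()
        self._scheduler = scheduler

    def tell(self, message):
        """Enqueue a message; schedule the actor only if it had no mail."""
        was_idle = not self.mailbox
        self.mailbox.append(message)
        if was_idle:
            self._scheduler.ready.append(self)

    def receive(self, message):
        if message == "like":
            self.delivered += 1


class Scheduler:
    """A single worker loop that serves whichever actors have mail."""

    def __init__(self):
        self.ready = deque()

    def run(self):
        processed = 0
        while self.ready:
            actor = self.ready.popleft()
            # Drain this actor's mailbox, then move on to the next one.
            while actor.mailbox:
                actor.receive(actor.mailbox.popleft())
                processed += 1
        return processed


scheduler = Scheduler()
actors = [ConnectionActor(f"conn-{i}", scheduler) for i in range(10_000)]

for actor in actors[:3]:  # only three of the 10,000 connections see activity
    actor.tell("like")
actors[0].tell("like")

processed = scheduler.run()
print(processed, actors[0].delivered, actors[9_999].delivered)
```

Ten thousand actors exist, but the worker only ever touches the three that received messages. That is the reason a small number of threads can serve a large number of mostly idle connections.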

Question: 

Do you think the actor model is easier to reason about than, say, threading?

Answer: 

Absolutely. The alternative is to have, for example, scheduled executor services and thread pools to manage these connections. What you start to struggle with there is managing shared state, sizing the thread pool, making sure that it has sufficient resources, and preventing starvation. And as the system grows, having a thread dedicated to each connection can result in very poor scaling characteristics. If you don't do it the way actors do it, which is to pass messages to each other, act only when there is something to process, and have no shared state across actors, these things become really hard to scale.

Question: 

What do you want people to leave the talk with?

Answer: 

As I said above, I really want the audience to walk away with practical advice for building distributed systems that can support the distribution of events to millions of connected clients, advice that they can go and apply to their applications directly. We built this system from the ground up, and I want to show real examples of how we did that and how we layered on top of the simple systems that we built. Secondly, I would like them to see how seemingly impossible scale can be achieved with simple building blocks. You take these building blocks, use powerful asynchronous frameworks like Play and Akka, and suddenly you're able to scale to a system that can serve viewers across the globe.

Speaker: Akhilesh Gupta

Sr. Staff Software Engineer @LinkedIn

Akhilesh is the technical lead for LinkedIn's Real-time delivery infrastructure and LinkedIn Messaging. He has been working on the revamp of LinkedIn’s offerings to instant, real-time experiences. Before this, he was the head of engineering for the Ride Experience program at Uber Technologies in San Francisco. He holds a Master's degree in CS from Stanford University.
