Presentation: Streaming a Million likes/second: Real-time Interactions on Live Video

Track: Streaming Data Architectures

Location: Churchill, G flr.

Duration: 11:50am - 12:40pm

Day of week: Monday

What You’ll Learn

  1. Find out about LinkedIn’s Real-time Distribution Platform, what it does and how it does it.
  2. Learn how to scale up to millions of users globally.

Abstract

When a broadcaster like the BBC streams a live video on LinkedIn, tens of thousands of viewers watch it concurrently. Typically, hundreds of likes per second on the video are streamed in real time to all of these viewers. That amounts to a million likes/second streamed to viewers per live video. How do we make this massive real-time interaction possible across the globe? In this talk, I’ll do a technical deep-dive into how we use the Play and Akka frameworks and a scalable distributed system to enable live interactions like likes and comments at massive scale and extremely low cost across multiple data centers.
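The "million likes/second" figure is a fan-out number: every like is delivered to every connected viewer. A back-of-the-envelope check, using assumed round numbers at the low end of "tens of thousands" of viewers and "hundreds" of likes (the exact figures are not given in this abstract):

```python
# Illustrative fan-out arithmetic. Both inputs are ASSUMED round numbers,
# picked only to be consistent with the ranges quoted in the abstract.
concurrent_viewers = 10_000   # assumed viewers watching one live video
likes_per_second = 100        # assumed likes arriving each second

# Each like is streamed to every connected viewer.
deliveries_per_second = concurrent_viewers * likes_per_second

print(f"{deliveries_per_second:,} like deliveries per second")
```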

Topics I will cover include:

  • Server-side and client-side frameworks for persistent connections.
  • Managing persistent connections with millions of active clients.
  • Pub/Sub architecture for real-time streaming with less than 100ms end to end latency to millions of connected clients. Hint: No Kafka!
  • Leveraging the same platform for other dynamic experiences like Presence.

Question:

What is the work you're doing today?

Answer: 

I'm the Tech Lead for LinkedIn Messaging and LinkedIn’s Real-time Distribution Platform. This is a platform that we use to deploy server-to-client streaming technology to power many dynamic experiences on LinkedIn. This includes instant distribution of likes, comments and concurrent viewer counts on live videos, instant messaging, typing indicators, seen receipts, and even online presence, those green online indicators that you see when you message someone.

Question: 

What are the goals you have for the talk?

Answer: 

The talk is centered around the platform that I just described, which supports real-time distribution of likes, comments, concurrent viewer counts, and notifications to millions of connected viewers who are watching live videos on LinkedIn at any given time. I have a few goals.

First, I want to get the audience really excited about the importance of dynamic, interactive experiences in their apps. These days, Instagram, Twitch, Facebook, and LinkedIn Learning are all moving towards getting people to interact with each other and learn from each other. That matters especially at LinkedIn: in the professional context, we want people to learn from each other, build those networks, and connect with each other. I want the audience to be excited about how they can apply such technology to their own apps.

Secondly, I want to do that by introducing them to the fundamental building blocks that you would need to build such a system. I believe those building blocks are a persistent connection with your clients, which allows you to actually stream data to them; a way for clients to subscribe to the topics they're interested in; and a way for publishers to publish to those topics, so that you can stream relevant data down to the clients at scale.

And thirdly, I want to discuss challenges in distributed systems and how to solve them by starting small and then adding layers of simple architecture to reach massive scale. There's no magic there. I will do that by sharing real, practical experiences that we had in doing so.
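The three building blocks named in this answer (a persistent connection, topic subscription, and topic publishing) can be sketched in a few lines. This is a minimal single-process illustration, not LinkedIn's implementation: the `Connection` and `Dispatcher` classes and the topic name are hypothetical, and a real connection would write to an open socket instead of a list.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List


class Connection:
    """Hypothetical stand-in for one persistent client connection."""

    def __init__(self, client_id: str) -> None:
        self.client_id = client_id
        self.received: List[str] = []

    def send(self, event: str) -> None:
        # A real system would push to an open socket; here we just record it.
        self.received.append(event)


class Dispatcher:
    """Hypothetical in-memory topic registry: topic -> subscribed connections."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, Dict[str, Connection]] = defaultdict(dict)

    def subscribe(self, topic: str, conn: Connection) -> None:
        self._subscribers[topic][conn.client_id] = conn

    def unsubscribe(self, topic: str, conn: Connection) -> None:
        self._subscribers[topic].pop(conn.client_id, None)

    def publish(self, topic: str, event: str) -> int:
        """Send the event to every subscriber of the topic; return the fan-out."""
        connections = list(self._subscribers.get(topic, {}).values())
        for conn in connections:
            conn.send(event)
        return len(connections)


dispatcher = Dispatcher()
alice, bob = Connection("alice"), Connection("bob")
dispatcher.subscribe("live-video-1", alice)
dispatcher.subscribe("live-video-1", bob)
dispatcher.subscribe("live-video-2", bob)

# One published like fans out to both viewers of live-video-1.
delivered = dispatcher.publish("live-video-1", "like")
```

Only viewers subscribed to a topic receive its events, so publishing to `live-video-2` here would reach `bob` alone.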

Question: 

In the abstract, you mention you're using Akka and Play. Why were those particular frameworks used?

Answer: 

The biggest reason is scale. Both Play and Akka are completely asynchronous, event-driven frameworks, which means that there are no blocking operations or shared state anywhere. That's the fundamental reason we chose them. Play, specifically, is a completely asynchronous web server framework that lets you serve a large number of requests with a very small number of threads, because a thread is used only while a request actually has work to do. Akka enables a concurrent, message-driven system built on the concept of actors. Actors have state that is modified only via messages, so each actor can do its small task without having to worry about what is happening in the rest of the system. A thread is used only while an actor is processing messages, and it is reassigned to the next actor that needs it when the first one goes idle. So a small number of threads can serve a large number of actors. In my case, I use actors for maintaining these connections. Each connection is maintained by one actor, and you're able to scale very effectively because those actors work independently and serve their connections only when activity happens.
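The idea of many actors sharing a few threads can be illustrated without Akka. The sketch below uses only the Python standard library and is not Akka's API: `ConnectionActor`, its `tell` method, and the pool size are hypothetical names and values chosen for illustration. Each actor owns a mailbox, and it borrows a pool thread only while that mailbox has messages.

```python
import threading
from collections import deque
from concurrent.futures import ThreadPoolExecutor


class ConnectionActor:
    """Hypothetical actor: private state, changed only by handling messages."""

    def __init__(self, pool: ThreadPoolExecutor) -> None:
        self._pool = pool
        self._mailbox = deque()
        self._lock = threading.Lock()
        self._running = False   # True while a pool thread is draining the mailbox
        self.likes_seen = 0     # actor state, touched only inside _handle

    def tell(self, message: str) -> None:
        """Enqueue a message; schedule the actor only if it is currently idle."""
        with self._lock:
            self._mailbox.append(message)
            if self._running:
                return
            self._running = True
        self._pool.submit(self._drain)

    def _drain(self) -> None:
        # Process messages one at a time, then release the thread when idle.
        while True:
            with self._lock:
                if not self._mailbox:
                    self._running = False
                    return
                message = self._mailbox.popleft()
            self._handle(message)

    def _handle(self, message: str) -> None:
        if message == "like":
            self.likes_seen += 1


# Four threads serve a thousand actors (both numbers are arbitrary).
pool = ThreadPoolExecutor(max_workers=4)
actors = [ConnectionActor(pool) for _ in range(1000)]
for _ in range(5):
    for actor in actors:
        actor.tell("like")
pool.shutdown(wait=True)   # wait until every mailbox has been drained

total_likes = sum(actor.likes_seen for actor in actors)
```

Because at most one thread drains a given mailbox at a time, `likes_seen` needs no lock of its own, which is the property that makes per-actor state easy to reason about.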

Question: 

Do you think the actor model is easier to reason about than, let's say, threading?

Answer: 

Absolutely. The alternative is to have, for example, scheduled executor services and thread pools to manage these connections. What you start to struggle with there is managing shared state, sizing the thread pool, making sure that it has sufficient resources, and preventing starvation. And as the system grows, having a thread dedicated to each connection can result in very poor scaling characteristics. Actors pass messages to each other, act only when there is something to process, and have no concept of shared state between them. If you don't work within those constraints, these things become really hard to scale.

Question: 

What do you want people to leave the talk with?

Answer: 

As I said above, I really want the audience to walk away with practical advice for building distributed systems that can support the distribution of events to millions of connected clients, advice that they can go and apply to their applications directly. We built this system from the ground up, and I want to show real examples of how we did that and how we layered on top of the simple systems that we built. Secondly, I would like them to see how seemingly impossible scale can be achieved with simple building blocks. You take these building blocks, use powerful asynchronous frameworks like Play and Akka, and suddenly you're able to scale to a system that can serve viewers across the globe.

Speaker: Akhilesh Gupta

Sr. Staff Software Engineer @LinkedIn

Akhilesh is the technical lead for LinkedIn's Real-time delivery infrastructure and LinkedIn Messaging. He has been working on the revamp of LinkedIn’s offerings to instant, real-time experiences. Before this, he was the head of engineering for the Ride Experience program at Uber Technologies in San Francisco. He holds a Master's degree in CS from Stanford University.
