Presentation: Cloud-based Microservices powering BBC iPlayer

Duration: 10:35am - 11:25am

Key Takeaways

  • Hear the narrative of how the BBC moved from the Olympics of 2012 to today’s cloud-native, microservice-based architecture.
  • Understand how an organization whose problem sits mostly on the Command side of the CQRS pattern used cloud services to achieve higher throughput and reliability in a microservice-based architecture.
  • Hear some of the lessons and tips the BBC picked up while migrating to the cloud, including: breaking your system down, learning to plan for failure, leveraging SQS, and more.

Abstract

In 2012 the system that got video into iPlayer was a monolith. In the nine months between Jan 2013 and Sept 2013 we replaced it with a new system we called Video Factory. It uses a microservices architecture and runs in AWS.

Video Factory has allowed us to more than double the amount of video available in iPlayer, extend availability from 7 days to 30 days and increase the amount of HD content by more than 700%. It also allows us to scale for sudden spikes in video going into iPlayer during events like Glastonbury and Wimbledon.

At the same time, it has allowed us to move to a continuous delivery model: our developers can now deploy a component to live in under 15 minutes and perform dozens of live deployments every week.

In 2014 we started serving simulcast content (on-line versions of our TV channels) from Video Factory, and during 2015 we moved radio content into this new system, for the first time unifying the BBC's systems for handling Audio and Video content online. We also added support for Podcasts, Audio and Video Clip Publishing, BBC Store and S4C. Recently we completely updated Video Factory to enable MPEG-DASH and add better support for HLS. This allows BBC iPlayer to make audio and video playable in modern browsers and on Android devices without the need for Flash or AIR.

In this talk, Stephen Godwin describes how the BBC integrated its broadcast systems with AWS, how Video Factory is built around a microservices architecture that uses both REST and SQS, and how this has allowed new features to be added and large changes to be made without interrupting the normal operation of iPlayer.

Interview

Question: 
What is your role today?
Answer: 
I’m Senior Technical Architect at the BBC, and I’m responsible for designing the systems that publish all the BBC's online media.
Question: 
For those who may not be familiar, what is in the Digital Media space for the BBC?
Answer: 
The Digital Media area that I work in includes BBC iPlayer, iPlayer Radio, Podcasts, and Clips (trailers for things like Doctor Who that go on the website as well). We’re responsible for creating the media that supports playback on all the different kinds of device, ranging from game consoles and smart TVs to phones, tablets, PCs and Macs.
Question: 
Is the architecture you plan to describe in your talk on-prem or cloud-based?
Answer: 
It’s a cloud-based architecture with a very small on-prem footprint to integrate with the broadcast chains. Apart from that small footprint (and some of the failure scenarios/reserves we keep on premise), our architecture is now entirely cloud-based.
We moved to this cloud-based architecture in 2013. We actually designed the initial version over 9 months. While the architecture has grown quite a lot since then, our initial system was a complete replacement for the system that was powering iPlayer. The previous system was an on-site, poll-based monolithic app that wasn’t terribly stable at the time and wasn’t really able to cope with the amount of content that we needed to publish.
Question: 
What is the motivation for your talk?
Answer: 
I’m going to talk about how we solved the problems of a monolithic system and moved into the cloud.
For example, back then we had capacity problems. We were only able to do about 20 hours of HD content a week (which is tiny). We got into that position because the previous system (which ran on physical hardware) had been designed 5 years earlier with a physical limit in mind. In those 5 years, we saw tablets, mobiles, and other devices come along and become a large part of the iPlayer audience. Yet we had a system that was designed for the audience of 4 or 5 years before.
So part of what I’ll discuss is how we designed a system that would scale linearly. The idea is you could actually give someone a rate card and say if you want to double the amount of content, here’s what it will cost you. Ideally, that would be a linear scaling cost.
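As a rough illustration of the rate-card idea, the sketch below models cost as a linear function of the hours of content processed. The function name and the price per hour are hypothetical, chosen only for the example; they are not BBC figures.

```python
def weekly_cost(hours_of_content: float, cost_per_hour: float) -> float:
    """Linear rate card: cost grows in direct proportion to content volume."""
    if hours_of_content < 0 or cost_per_hour < 0:
        raise ValueError("hours and rate must be non-negative")
    return hours_of_content * cost_per_hour


# Hypothetical rate, for illustration only.
RATE_PER_HOUR = 10.0

baseline = weekly_cost(20, RATE_PER_HOUR)
doubled = weekly_cost(40, RATE_PER_HOUR)

# Under a linear model, doubling the content doubles the cost.
print(doubled / baseline)
```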
I will also talk about problems where it took us a fortnight to deploy a code change to production. It was quite frustrating. Some bugs didn’t get fixed because of that. So, as we designed this new system, we wanted something where we could deploy small changes quickly, rather than taking a big bang approach (where everything catches fire at once). I will discuss how our current approach really lets us deploy without people noticing.
Question: 
What kind of things are you going to discuss in your talk that I might run into as I move from an on-prem architecture to a cloud native architecture?
Answer: 
I think there are some basic things that you often go for when you are moving to a cloud-based architecture.
One is breaking your system down into smaller components. That helps if you have some of the same objectives that we had, like being able to scale very easily. If you have lots of small components, it’s easier to scale different parts of the system. That puts certain design considerations on you, so you need to think about them upfront: things like how you monitor, track, and debug. I’ll discuss how you go about doing that. Getting that in from day 1 is really useful.
I will also talk about the challenges of moving to a continuous delivery model and some of the best practices we’ve discovered.
Some other advice will be around expecting things to fail when you are building a distributed system.
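One common way to plan for failure in a queue-driven system is to put a failed message back on the queue and retry it, then park it after a fixed number of attempts. The sketch below shows that idea with an in-memory `deque` standing in for a message queue. It is not the SQS API, and the names `process_with_redelivery`, `FlakyError`, and `transcode` are hypothetical, not taken from Video Factory.

```python
from collections import deque


class FlakyError(Exception):
    """Stands in for any transient failure (network, dependency, crash)."""


def process_with_redelivery(messages, handler, max_attempts=3):
    """Process each message, re-queueing failures up to max_attempts.

    Returns (done, dead): results that succeeded, and messages given up on.
    """
    queue = deque((msg, 1) for msg in messages)
    done, dead = [], []
    while queue:
        msg, attempt = queue.popleft()
        try:
            done.append(handler(msg))
        except FlakyError:
            if attempt < max_attempts:
                queue.append((msg, attempt + 1))  # redeliver later
            else:
                dead.append(msg)  # park for inspection instead of losing it
    return done, dead


# Hypothetical handler that fails the first time it sees message "b".
seen = set()


def transcode(msg):
    if msg == "b" and msg not in seen:
        seen.add(msg)
        raise FlakyError("transient failure")
    return msg.upper()


done, dead = process_with_redelivery(["a", "b", "c"], transcode)
print(done, dead)
```

Because the failed message goes to the back of the queue, one bad item does not block the work behind it. The trade-off is that results can arrive out of order.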
Question: 
What are some of the takeaways that someone will gain from your talk?
Answer: 
Design your system to scale linearly, expect things to fail, avoid big-bang changes, and design for migrations: those are some of the big ones.
For our use case, we have more of a backend integration type of system. So I will discuss how our cloud-based architecture lets us handle the thousands of hours of content that the BBC produces. If you think in terms of the CQRS model, ours is much more a command-oriented problem than a query-oriented one. There is certainly a query part, around how we distribute media out, and it is non-trivial, but some of the really interesting parts of the system I will discuss are on the command side.
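For readers less familiar with CQRS, the sketch below shows the basic split it refers to: commands request a change of state, while queries only read. The class and method names (`PublishMedia`, `MediaCatalogue`, `handle`, `title_of`) are hypothetical and purely illustrative. Nothing here reflects how Video Factory is actually implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class PublishMedia:
    """A command: a request to change state (hypothetical name)."""
    media_id: str
    title: str


@dataclass
class MediaCatalogue:
    _items: Dict[str, str] = field(default_factory=dict)

    # Command side: validates and applies a change; returns nothing to read.
    def handle(self, command: PublishMedia) -> None:
        if not command.media_id:
            raise ValueError("media_id is required")
        self._items[command.media_id] = command.title

    # Query side: reads state and never modifies it.
    def title_of(self, media_id: str) -> Optional[str]:
        return self._items.get(media_id)


catalogue = MediaCatalogue()
catalogue.handle(PublishMedia("p1", "Doctor Who trailer"))
print(catalogue.title_of("p1"))
```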