Presentation: #NetflixEverywhere Global Architecture

Duration: 
11:50am - 12:40pm

Key Takeaways

  • Hear the Netflix narrative about the technical strategy, patterns, outcomes, and challenges overcome as Netflix became a globally available, ubiquitous service provider.
  • Understand how Netflix evolved the architecture to embrace failure and recover quickly from outages in a relatively short period of time.
  • Learn concrete, reusable patterns applicable to anyone operating in a Virtual Data Center like AWS.

Abstract

On December 24th, 2012, AWS US-EAST-1 experienced a region-wide failure that took down the Netflix service for almost 24 hours. Knowing that failure is inevitable in any complex system, we evolved our cloud-based, micro-service architecture to support multi-region traffic management and failover capabilities. With that foundation in place, we drove initiatives to achieve service ubiquity and rapid global expansion. The overarching theme is #NetflixEverywhere: an amazing, global, highly available movie and TV streaming experience for any member, anytime, on any device, anywhere in the world.

Building and evolving a pervasive, global service requires a multi-disciplined approach that balances requirements around service availability, latency, data replication, compute capacity, and efficiency. In this session, we’ll follow the Netflix journey of failure, innovation, and ubiquity. We’ll review the many facets of globalization, then delve deep into the architectural patterns that enable seamless multi-region traffic management; reliable, fast data propagation; and efficient service infrastructure. The patterns presented will be broadly applicable to internet services with global aspirations.
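
To make the idea of multi-region traffic management and failover more concrete, here is a minimal sketch in Python. It is not Netflix's implementation: the region names, the FAILOVER_ORDER table, and the pick_region and route_all helpers are all hypothetical, chosen only to show the general pattern of sending traffic to its home region while that region is healthy and shifting it to the next healthy region otherwise.

```python
from typing import Dict, List, Optional

# Hypothetical failover preferences: for each home region, the ordered
# list of regions that may absorb its traffic. Illustrative values only.
FAILOVER_ORDER: Dict[str, List[str]] = {
    "us-east-1": ["us-west-2", "eu-west-1"],
    "us-west-2": ["us-east-1", "eu-west-1"],
    "eu-west-1": ["us-east-1", "us-west-2"],
}


def pick_region(home: str, healthy: Dict[str, bool]) -> Optional[str]:
    """Return the region that should serve traffic homed in `home`.

    Prefers the home region, then each fallback in order.
    Returns None when no candidate region is healthy.
    """
    if healthy.get(home, False):
        return home
    for candidate in FAILOVER_ORDER.get(home, []):
        if healthy.get(candidate, False):
            return candidate
    return None


def route_all(healthy: Dict[str, bool]) -> Dict[str, Optional[str]]:
    """Compute the serving region for every known home region."""
    return {home: pick_region(home, healthy) for home in FAILOVER_ORDER}


if __name__ == "__main__":
    # Simulate a region-wide outage in one region.
    health = {"us-east-1": False, "us-west-2": True, "eu-west-1": True}
    for home, target in route_all(health).items():
        print(f"{home} -> {target}")
```

In an active-active setup every region serves traffic all the time, so a table like this only decides where the failed region's share goes. A real system would also need to confirm that the receiving region has the compute capacity and replicated data to absorb that share, which this sketch does not model.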

Interview

Question: 
QCon: At CES 2016, Reed Hastings announced “the birth of a new global Internet TV network.” It was all over Twitter with the hashtag #netflixeverywhere. What does this really mean to Netflix?
Answer: 
Josh: At the end of the day, we’re here to deliver amazing content to our members. Our strategy is to leverage our scale over time. Now that we’re global, and the economics favor global licensing, we can jump in with both feet, making content available everywhere in the world and, in many cases, at the same time. Take “Marvel’s Daredevil,” which we’ll release on March 18. When we release it, all episodes will be available at the same time everywhere Netflix is available. That’s our goal. Great stories transcend borders, and we want our content to be ubiquitous. Ubiquity is a big theme of this talk.
Global insight is also very important for Netflix. Now that we are in those regions, we can see where the problems are and how problems manifest. Until you are global, you are just guessing about the problems you’ll run into. From a business perspective, knowing where the interest lies allows us to understand where there are high leverage opportunities for improvement. Examples of this are evaluating local payment options or the importance of investing in local content. From a technical perspective we have a huge opportunity to understand global latency and reliability. Going global gives us a chance to learn. These new global insights are why we talk about it being the beginning of the process.
Question: 
QCon: Are you still running global operations from the traditional AWS data centers?
Answer: 
Josh: Yes, absolutely. AWS is foundational to our global strategy. In fact, we’re deepening that investment this year by starting our migration from EC2 Classic to VPC. We’re looking forward to the performance improvements and compute capabilities available in that environment.
Question: 
QCon: You mention resiliency patterns in the context of Netflix architecture. Does this talk get into what that means?
Answer: 
Josh: In detail. Resiliency patterns easily comprise half of the talk. They have largely defined the architecture we have today. For example, we will discuss failover patterns like bypassing elastic load balancers, how and why we do regional failovers, and ultimately how we arrived at the multi-region active-active environment Netflix has today. There are some cool visuals along the way, too.
Question: 
QCon: I’m curious, failure seems to be such a big part of your talk. Why isn’t failure the focus in your title?
Answer: 
Josh: As many people know, failure is fundamental to how Netflix thinks about architecture. But the real story here is about how failure drove us to build a robust platform that is ready to support a truly global service. This talk covers our failures and the resulting journey that led us to #netflixeverywhere.
Question: 
QCon: Now that Netflix is global, what’s next?
Answer: 
Josh: Our CEO Reed Hastings said recently of our global launch: “It’s just like having a baby.” We are going to spend the next 20 years learning and improving upon what we’ve done. Now that we have a global presence, we can deepen our understanding and let that guide us forward. Now we can see where the activity is, where the interest is, and where we should invest our energy. A global Netflix gives us great insight that we wouldn’t otherwise have. So what’s next? We’ll see where the data leads us.

Monday, 7 March

Tuesday, 8 March

Wednesday, 9 March
