Presentation: "Distributed Systems, Databases, and Resilience"

Time: Thursday 12:05 - Friday 13:05

Location: St James’s Suite, Fourth Floor

Abstract:

Everything breaks. Power supplies, spinning disks, volatile memory, LANs and WANs, schedulers, caches... everything. Riak is a durable distributed database, one kind of software that can be directly affected by all of these and more. Despite these many avenues of potential failure, Riak promises (and delivers!) very high availability to its users. Justin will discuss the architectural and implementation choices in Riak that make this possible.

Together we will walk through a number of examples of real failures, and use Riak's techniques as a guide for thinking about how best to continue a system's overall successful operation (or if necessary, how to fail gracefully) in the face of those failures.

We will face the reality that anything we depend on can fail, and in the worst possible combination. We will learn how to build systems that let us happily sleep at night despite this troubling realization.

Justin Sheehy, CTO of Basho Technologies

Justin Sheehy is the CTO of Basho Technologies, the company behind the creation of Webmachine and Riak. Most recently before Basho, he was a principal scientist at the MITRE Corporation and a senior architect for systems infrastructure at Akamai. At both of those companies he focused on multiple aspects of robust distributed systems, including scheduling algorithms, language-based formal models, and resilience.