Presentation: Causal Consistency For Large Neo4j Clusters

Duration: 5:25pm - 6:15pm

Key Takeaways

  • Understand what is meant by Causal clustering, and its application
  • Understand how to build applications with Raft
  • Decipher whether you can mix asynchronous replication for scale with Raft for safety (yes!)

Abstract

In this talk we'll explore the new Causal clustering architecture for Neo4j. We'll see how Neo4j uses the Raft protocol as a robust underlay for intensive write operations, and how the new asynchronous scale-out mechanism provides enormous capacity for very demanding graph workloads.

We'll discuss the cluster architecture's new causal consistency model. Causal consistency is a big leap forward compared to the commonplace eventual consistency, and it makes it convenient to write applications that use the full capacity of the cluster. In particular, we'll show how, despite the mixture of consensus protocols and asynchronous replication, Neo4j allows users to read their own writes straightforwardly, and we'll discuss why this is such a difficult achievement in distributed systems.
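To make the read-your-own-writes idea concrete, here is a toy sketch in Python. It is not Neo4j's implementation or its driver API: the Core and Replica classes, the causal_read helper, and the use of a log index as a "bookmark" token are all assumptions made for illustration. A write returns a token marking its position in the ordered log, and a later read carries that token so that a lagging replica must catch up before it answers.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Hypothetical stand-in for the consensus-backed write tier: an ordered log."""
    log: list = field(default_factory=list)

    def write(self, key, value):
        self.log.append((key, value))
        # The returned token (a "bookmark" in this sketch) is the log position
        # of the write the client has just made.
        return len(self.log)


@dataclass
class Replica:
    """Hypothetical stand-in for an asynchronously updated read server."""
    applied: int = 0
    data: dict = field(default_factory=dict)

    def pull(self, core, max_entries=1):
        # Apply up to max_entries not-yet-applied log entries, in log order.
        for key, value in core.log[self.applied:self.applied + max_entries]:
            self.data[key] = value
            self.applied += 1

    def read(self, key, bookmark=0):
        # Refuse to answer from a state older than the client's bookmark.
        if self.applied < bookmark:
            raise LookupError("replica is behind the bookmark")
        return self.data.get(key)


def causal_read(replica, core, key, bookmark):
    # Catch the replica up to the bookmark before answering the read.
    while replica.applied < bookmark:
        replica.pull(core)
    return replica.read(key, bookmark)


core, replica = Core(), Replica()
core.write("name", "old")
bookmark = core.write("name", "new")
replica.pull(core)  # the replica has applied only the first write so far

stale = replica.read("name")  # no bookmark: an eventually consistent read
fresh = causal_read(replica, core, "name", bookmark)
print(stale, fresh)
```

In this sketch the read without a bookmark returns the older value because the replica is lagging, while the read that carries the bookmark sees the client's own latest write. A real system has to do this across failures and many servers, which is where the difficulty lies.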

For the application developer, we'll show how Neo4j's Causal Clustering-optimised drivers make it easy to write applications that scale smoothly from a single server to a large, distributed cluster: a practical motivation for the distributed systems enthusiast.

Interview

Question: 
What is the focus of your work today?
Answer: 

I work on the Neo4j graph database, and focus on fault tolerance and scale generally, and have recently been working on consensus protocols and strong consistency models.

Question: 
What’s the motivation for your talk?
Answer: 

Eventual consistency is great for databases but less good for developers. Strong consistency is the opposite. I’ve been working on some database middleware that provides causal consistency - quite strong, but still allowing large scale.

Question: 
How would you describe the persona of the target audience of this talk?
Answer: 

People who are interested in distributed systems, especially databases. Also those who are planning to deploy databases. It’s a technical-ish talk so anyone in that kind of role might enjoy it.

Question: 
How would you rate the level of this talk?
Answer: 

Beginner. If you try to do complex distributed systems talks in 50 minutes, you’ll fail. This talk is deliberately accessible, partly as an homage to the design of the key protocol it uses - Raft - which strives to be humane where other protocols have tried to be clever.

Question: 
QCon targets advanced architects and senior development leads. What do you feel will be the actionable takeaway that type of persona will walk away from your talk with?
Answer: 

They will be able to build a system with Neo4j after this.

Question: 
What do you feel is the most disruptive tech in IT right now?
Answer: 

I’m biased, but I think it’s Neo4j. Longer term I actually think hardware architecture is going to stop being dull and start being radical again.

Speaker: Jim Webber

Chief Scientist @Neo4j

Dr. Jim Webber is Chief Scientist with Neo Technology, the company behind the popular open source graph database Neo4j, where he works on R&D for highly scalable graph databases and writes open source software. Jim has written two books on integration and distributed systems: “Developing Enterprise Web Services” on XML Web Services and “REST in Practice” on using the Web for building large-scale systems. His latest book is “Graph Databases”, which focuses on the Neo4j database.
