A New Era for Database Design with TigerBeetle


There was a time when the world could afford only a handful of open source relational database management systems. You could have any database, as long as it was MySQL or Postgres. These systems took 30 years to develop and test, and enjoyed widespread adoption.

And then something happened. It almost passed us by. 2018 drew a line in the sand, and marked the end of an era.

Darkest Before the Dawn
The foundations of database design were rocked both by the discovery of fsyncgate in the Postgres community, and the “slow train coming” of storage fault research from UW-Madison. Not only was the Linux kernel page cache no longer trustworthy, but the write-ahead log designs of most databases (take your pick!) were found, at least in the research, to be fundamentally broken, even to the extent of a single disk sector fault cascading into global cluster data loss.

Mission-critical applications were moving to proprietary cloud databases, and our favorite open source databases were left behind, stuck in the twilight of single-node availability and manual failover. We were yet to face the fallout. And then came the dawning of a new era.

First Gleam of Dawn
The first ray of light was FoundationDB, which had already pioneered deterministic simulation testing of distributed databases: injecting faults, finding and replaying rare bugs for increased developer velocity, and testing in months what had previously taken decades.

Next, Linux underwent a revolution in asynchronous I/O, saved from the complexity of kernel-bypass techniques such as SPDK and DPDK by Jens Axboe's io_uring.

Full Light of Day
Finally, safer languages such as Rust and Zig rose up to power the next 30 years of systems, making it easier to write correct code, bringing game changers like Zig's comptime, and reviving the lost art of static memory allocation for predictable performance in the extreme.

In this talk, we will look at these pivotal moments, and how they influenced our design decisions for TigerBeetle, the distributed financial accounting database for mission-critical safety and performance, developed under the Apache 2.0 open source license.

Taken together, these advances in database design will unlock an abundance of new open source database management systems, tailored to their domain. The best is yet to come.


Speaker

Joran Greef

Founder and CEO @TigerBeetle

Joran Dirk Greef is the Founder and CEO of TigerBeetle, the distributed financial accounting database designed for mission-critical safety and performance. His interests are storage, speed, and safety.

Date

Monday Mar 27 / 10:35AM BST (50 minutes)

Location

Mountbatten (6th Fl.)

From the same track

Session Microservices

Change Data Capture for Microservices

Monday Mar 27 / 01:40PM BST

Microservices represent complex business domains in the form of loosely coupled systems, but these don't exist in isolation: services need to propagate data changes amongst each other, in a reliable and scalable way.

Gunnar Morling

Senior Staff Software Engineer @Decodableco

Session transactions

Amazon DynamoDB Distributed Transactions at Scale

Monday Mar 27 / 02:55PM BST

NoSQL databases are popular for their high availability, high scalability, and predictable performance.

Akshat Vig

Senior Principal Engineer NoSQL databases @awscloud

Session Apache Pinot

Speed of Apache Pinot at the Cost of Cloud Object Storage with Tiered Storage

Monday Mar 27 / 11:50AM BST

For real-time analytics, you need systems that can provide ultra low latency (milliseconds) and extremely high throughput (hundreds of thousands of queries per second).

Neha Pawar

Founding Engineer @StarTree

Session processing techniques

In-Process Analytical Data Management with DuckDB

Monday Mar 27 / 05:25PM BST

Analytical data management systems have long been monolithic monsters far removed from the action by ancient protocols. Redesigning them to move into the application process greatly streamlines data transfer, deployment, and management.

Hannes Mühleisen

Co-founder and CEO @duckdblabs

Session raft

Multi-Region Data Streaming with Redpanda

Monday Mar 27 / 04:10PM BST

Real time data streaming platforms such as Redpanda have become a mission critical component in enterprise infrastructure. Multi-region deployments of streaming applications can provide important benefits, such as improved resiliency, better performance and cost reduction.

Michał Maślanka

Software Engineer @Redpanda