A growing number of databases specialize in ingesting and serving derived data. They complement more traditional “primary data” (or “source of truth”) systems. This talk proposes a new vocabulary for navigating the evolving landscape of data systems and exploring their tradeoffs.
Besides explaining what derived data is, we will also dive into four major use cases that fit in the derived data bucket: graphs, search, OLAP, and ML feature storage. We will explore the systems that support these use cases at LinkedIn, including open source solutions such as Pinot, which powers Who Viewed My Profile, and Venice, which powers People You May Know.
Speaker
Felix GV
Principal Staff Engineer @LinkedIn
Felix joined LinkedIn's data infrastructure team in 2014, first working on Voldemort, the predecessor of Venice. Over the years, Felix has participated in all phases of the development lifecycle of Venice, from requirements gathering and architecture to implementation, testing, rollout, integration, stabilization, scaling, and maintenance.