What is Derived Data? (And do You Already Have Any?)

There is a growing trend of databases specializing in derived data ingestion and serving. They complement more traditional “primary data” (or “source of truth”) systems. This talk proposes a new vocabulary for navigating the evolving landscape of data systems and exploring their tradeoffs.

Besides explaining what derived data is, we will also dive into four major use cases that fit in the derived data bucket: graphs, search, OLAP and ML feature storage. We will explore the systems that support these use cases at LinkedIn, including open source solutions such as Pinot, which powers Who Viewed My Profile, and Venice, which powers People You May Know.


Speaker

Felix GV

Principal Staff Engineer @LinkedIn

Felix joined LinkedIn's data infrastructure team in 2014, first working on Voldemort, the predecessor of Venice. Over the years, Felix has participated in all phases of the development lifecycle of Venice, from requirements gathering and architecture to implementation, testing, rollout, integration, stabilization, scaling and maintenance.

Date

Monday Mar 27 / 05:25PM BST (50 minutes)

Location

Churchill (Ground Fl.)

Topics

Data, case study, Graphs, search, OLAP, ML Feature Storage, Data Systems

From the same track

Session consistency

Eventual Consistency – Don’t Be Afraid!

Monday Mar 27 / 10:35AM BST

Distributed data-intensive systems are increasingly designed to be only eventually consistent.

Susanne Braun

Principal Tech Lead @SAPSignavio

Session cloud native

The Commoditization of the Software Stack: How Application-first Cloud Services are Changing the Game

Monday Mar 27 / 11:50AM BST

The runtime boundaries between applications and the cloud are shifting from virtual machines to containers and functions. The integration boundaries are moving away from pure data access to one where the mechanical parts of the application are running within the cloud.

Bilgin Ibryam

Principal Product Manager @Diagrid, Co-author of “Kubernetes Patterns”, Previously Architect @RedHat

Session OpenTelemetry

Effective and Efficient Observability with OpenTelemetry

Monday Mar 27 / 02:55PM BST

Modern architectures require effective observability solutions to be able to monitor their health and understand how system changes affect operations distributed across multiple services.

Daniel Gomez Blanco

Principal Engineer @Skyscanner

Session api

Connecting the Dots: API Design in a Distributed World

Monday Mar 27 / 04:10PM BST

As we’ve gone from building monoliths to building microservices, the number of APIs we’ve got to manage has gone from just the database and front end to at least one per service.

Ben Gamble

Adviser, Architect & Speaker About Interactive Technology, Startups & Event Driven Systems

Session

Unconference: Building Modern Backends

Monday Mar 27 / 01:40PM BST

What is an unconference? An unconference is a participant-driven meeting. Attendees come together, bringing their challenges and relying on the experience and know-how of their peers for solutions.

Shane Hastie

Global Delivery Lead @SoftEd, Lead Editor for Culture & Methods @InfoQ