Speaker: Dr. Einat Orr

(She / her / hers)

Co-creator of @lakeFS, Co-founder & CEO of Treeverse

Einat Orr has 20+ years of experience building R&D organizations and leading the technology vision at multiple companies, the latest being Similarweb, which went public on the NYSE last May. Currently she serves as Co-founder and CEO of Treeverse, the company behind lakeFS, an open source platform that delivers a Git-like experience to object-storage-based data lakes. She received her PhD in Mathematics from Tel Aviv University, in the field of optimization in graph theory.

Find Dr. Einat Orr at:

Session + Live Q&A

Data Versioning at Scale: Chaos and Chaos Management

Version control is fundamental when managing code, but what about data? Our data changes over time. First, it accumulates: we have new data points for new points in time. But this is not the only reason. We also add data for past periods when we gain access to additional data sources, and we change past data in light of new information that was late to arrive.

Since our data is mutable, version control of the data will allow us to ensure we can reproduce a set of results, provide us with the lineage between the input and output data sets of a process or a model, allow us to experiment, provide the relevant information for auditing, and assist us in production management. In this talk we will go over several technologies that version large data sets. We will understand the use cases they support and look under the hood at the technology developed to best support those use cases.

Date

Wednesday Apr 6 / 04:10PM BST (50 minutes)

Location

Churchill, G flr.

Track

Modern Data Pipelines & DataMesh

Topics

Data Engineering


Session + Live Q&A

Connecting Modern Data Pipelines and Data Products

The complexity of tools, distributed systems, and the CAP theorem introduce tradeoffs that practitioners cannot avoid or ignore as they embrace the world of modern data pipelines. What strategies can you employ? This is where data products come into play. Understanding the business objectives of data products helps us make informed decisions about tools, architecture, and services. Join this panel to learn from data thought leaders! 

Date

Wednesday Apr 6 / 11:50AM BST (50 minutes)

Location

Whittle, 3rd flr.

Track

Modern Data Pipelines & DataMesh

Topics

Data Engineering

