The Harsh Reality of Building a Realtime ML Feature Store

In a world where AI and ML are rapidly evolving, the need for efficient Realtime Feature Stores has never been greater. But the journey to create one is far from straightforward.

In this talk, Ivan Burmistrov will share how ShareChat, the largest social network in India, built its own Realtime Feature Store serving more than 1 billion features per second, and how the team made it cost-efficient.

Ivan will cover the challenges the team faced along the way, how they overcame them, and which ones are still not fully resolved. The talk will also cover the team's experience with relatively new technologies such as ScyllaDB and Redpanda, and why such technologies are crucial for building a cost-efficient system. Additionally, Ivan will share how the system leverages Apache Flink at the very core of the data pipeline.

This talk will provide insights for anyone interested in real-time data pipelines, and in Realtime Feature Stores in particular.


From the same track

Session

Rockset - Building a Modern Analytics Database on Top of RocksDB

RocksDB, a key-value store built on Log-Structured Merge-Tree data structures and originally open-sourced by Facebook, has played a significant role in shaping data systems over the past decade.

Igor Canadi

Founding Engineer and Architect @Rockset, Previously at RocksDB and Facebook

Session

Open Formats: The Happy Accident Disrupting the Data Industry

Analytic databases are quietly going through an unprecedented transformation. Open table formats, like Apache Iceberg, enable multiple query engines to share one central copy of a table.

Ryan Blue

Co-Founder and CEO @Tabular, Co-creator of Apache Iceberg

Session

Powering User Experiences with Streaming Dataflow

Streaming dataflow provides a unique solution to scaling OLTP applications by allowing for an efficient cache implementation that does not diverge from the relational model of the underlying data store.

Alana Marzoev

Founder & CEO @ReadySet

Session

High Performance Time-Series Database Design With QuestDB

In this talk, we will explore the world of time series and the unique set of problems that time series present to developers. We will discuss the engineering principles behind QuestDB's design, focusing on high performance.

Vlad Ilyushchenko

Co-Founder & CTO @QuestDB

Session

How Xata Improved the Way Developers Work With Data and Solved Some Tough Problems Along the Way

Validating your code against actual production data can be challenging. We have all, at least once, been on the receiving end of a "test1" email subject because somebody somewhere ran a test against the production database.

Noémi Ványi

Senior Software Engineer @Xata

Simona Pencea

Staff Software Engineer @Xata