Apache Iceberg: Powering the Future of Modern Data Architectures

As organizations scale their data platforms, traditional data lake architectures often struggle with data consistency, schema evolution, performance, and governance. Apache Iceberg is an open table format designed to address these challenges, enabling scalable, efficient, and reliable data management. This session explores how Iceberg enhances modern data architectures by providing ACID transactions, time travel, branching and tagging, and schema evolution, while optimizing query performance across multiple engines such as Spark, Trino, and Flink.

Additionally, we will examine how Apache Iceberg can be used across different cloud providers, making it a strong choice for companies whose data lakes are distributed across multiple cloud storage systems and who want to avoid vendor lock-in. By decoupling storage from compute and providing a consistent table format, Iceberg enables interoperability, cost optimization, and simplified data governance. Attendees will gain insights into how Iceberg improves data reliability, reduces operational complexity, and unlocks new possibilities for analytics and machine learning in multi-cloud environments.

Finally, we will provide an overview of the upcoming Apache Iceberg version 3 and its new features, offering a glimpse into the future of this evolving technology.


Date

Wednesday Apr 9 / 03:55PM BST (50 minutes)

Location

Fleming (3rd Fl.)

From the same track

Session Data Architecture

Reliable Data Flows and Scalable Platforms: Tackling Key Data Challenges

Wednesday Apr 9 / 10:35AM BST

There are a few common and mostly well-known challenges when architecting for data. For example, many data teams struggle to move data in a stable and reliable way from operational systems to analytics systems.

Matthias Niehoff

Head of Data and Data Architecture @codecentric AG, iSAQB Certified Professional for Software Architecture

Session Data engineering

Building a Global Scale Data Platform with Cloud-Native Tools

Wednesday Apr 9 / 01:35PM BST

As businesses increasingly operate in hybrid and multi-cloud environments, managing data across these complex setups presents unique challenges and opportunities. This presentation provides a comprehensive guide to building a global-scale data platform using cloud-native tools.

George Hantzaras

Engineering Director, Core Platforms @MongoDB, Open Source Ambassador, Published Author

Session

Achieving Precision in AI: Retrieving the Right Data Using AI Agents

Wednesday Apr 9 / 11:45AM BST

In the race to harness the power of generative AI, organizations are discovering a hidden challenge: precision.

Adi Polak

Director, Advocacy and Developer Experience Engineering @Confluent, Author of "Scaling Machine Learning with Spark" and "High Performance Spark 2nd Edition"

Session

Streaming into the Future: Architecting for Real-Time Data Insights

Wednesday Apr 9 / 02:45PM BST

Details coming soon.