Summary
Disclaimer: This summary has been generated by AI. It is experimental, and feedback is welcome. Please reach out to info@qconlondon.com with any comments or concerns.
This presentation focuses on designing and operating a global-scale, AI-ready data platform that spans multiple regions and cloud providers. The talk highlights strategies for decoupling storage, compute, and AI workloads to enable efficient analytics, vector search, and LLM inference on shared datasets, while avoiding vendor lock-in.
Key Points:
- Decoupling Workloads: The talk discusses strategies for separating storage, compute, and AI tasks so that shared datasets can be used more efficiently and flexibly.
- Multi-Cloud Strategies: The talk explores the shift from single-cloud to hybrid and multi-cloud setups and the challenges involved.
- Storage Paradigms: The platform supports various storage paradigms including files, object stores, NoSQL, and relational databases.
- Operational Patterns: The talk outlines patterns for embeddings pipelines, vector indexes, reliability, and disaster recovery.
- Cost Management: Strategies for managing costs related to data gravity and GPU-intensive AI workloads are shared.
Takeaways: Attendees should leave with a clear understanding of concrete patterns, trade-offs, and pitfalls to avoid when building AI-centric, multi-cloud data platforms for business-critical applications.
This is the end of the AI-generated content.
Abstract
As organizations move from single-cloud setups to hybrid and multi-cloud strategies, they are under pressure to build data platforms that are both globally available and AI-ready. This talk walks through how to design and operate a global-scale data platform that spans regions and providers, supports multiple storage paradigms (files, object stores, NoSQL, relational), and exposes a clean experience to application teams. We’ll look at how to decouple storage, compute, and AI workloads so analytics, vector search, and LLM inference can run efficiently on shared datasets without creating a new kind of vendor lock-in. Along the way, we’ll cover patterns for embeddings pipelines and vector indexes, approaches for reliability and disaster recovery across regions and failure domains, and cost-management strategies that account for data gravity and GPU-heavy AI workloads. You’ll leave with concrete patterns, trade-offs, and pitfalls to avoid when taking real, messy, business-critical data platforms into an AI-centric, multi-cloud world.
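To make the "shared datasets without a new vector silo" idea concrete, here is a minimal Python sketch, not taken from the talk itself: an ingestion path that computes embeddings once and stores them alongside the source documents, and a read path that serves vector search from the same collection. The cluster URI, the index name docs_vector_index, and the embed() helper are all illustrative assumptions; the query uses MongoDB Atlas Vector Search's $vectorSearch aggregation stage.

```python
# Sketch: decoupled embeddings pipeline and vector search over one shared dataset.
# Assumptions (illustrative, not from the talk): an Atlas cluster with a vector
# search index named "docs_vector_index" on the "embedding" field, and a
# hypothetical embed() helper wrapping whatever embedding model the platform uses.

from pymongo import MongoClient

def embed(text: str) -> list[float]:
    """Hypothetical helper: call the platform's embedding model/service."""
    raise NotImplementedError

client = MongoClient("mongodb+srv://<cluster-uri>")
docs = client["platform"]["documents"]

# Write path: the ingestion job computes the embedding once and stores it next
# to the source document, so every downstream consumer shares the same vectors.
def upsert_with_embedding(doc_id: str, text: str) -> None:
    docs.update_one(
        {"_id": doc_id},
        {"$set": {"text": text, "embedding": embed(text)}},
        upsert=True,
    )

# Read path: analytics, RAG, and search all hit the same index via the
# $vectorSearch aggregation stage rather than a separate vector store.
def semantic_search(query: str, k: int = 5):
    return list(docs.aggregate([
        {"$vectorSearch": {
            "index": "docs_vector_index",
            "path": "embedding",
            "queryVector": embed(query),
            "numCandidates": 20 * k,  # oversample candidates for better recall
            "limit": k,
        }},
        {"$project": {"text": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]))
```

In the same spirit, the cross-region reliability theme can be sketched as region-aware reads combined with majority-acknowledged writes. The region tags and database name below are assumptions for illustration; the pattern is simply "read from the nearest healthy failure domain, commit writes across domains."

```python
# Sketch: region-aware reads with cross-region durable writes, assuming a
# replica set whose members carry region tags (tag values are illustrative).
from pymongo import MongoClient
from pymongo.read_preferences import Nearest
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb+srv://<cluster-uri>")

# Reads prefer members in the caller's region; the empty tag set {} matches
# any member, so reads fail over if the local failure domain is down.
local_reads = client.get_database(
    "platform",
    read_preference=Nearest(tag_sets=[{"region": "eu-west-1"}, {}]),
)

# Writes require acknowledgement from a majority of members, so a committed
# write survives the loss of a single region or failure domain.
durable_writes = client.get_database(
    "platform",
    write_concern=WriteConcern(w="majority"),
)
```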
Speaker
George Peter Hantzaras
Engineering Director, Core Platforms @MongoDB, Open Source Ambassador, Published Author
George is a distributed systems expert and a hands-on engineering leader. He is a Director of Engineering at MongoDB, focusing on implementing cloud-native technologies at enterprise scale. He is an Ambassador for the Data on Kubernetes community and the author of The Platform Engineering Playbook (Packt). Most recently, he has spoken at global events including KubeCon, Open Source Summit, HashiConf, LeadDev, and SaaStr.