Fixing the AI Infra Scale Problem by Stuffing 1M Sandboxes in a Single Server

Abstract

The past year has seen an explosion in the use of AI, and of agents in particular, a trend that is guaranteed to accelerate going forward. The sheer number of agents, and how heavily they are used, puts immense pressure on the underlying cloud infrastructure they run on; it is not uncommon to hear of services creating tens of millions of sandboxes in a matter of weeks. In the face of such disruptive scale, what can be done to ensure that we don’t keep throwing money at the problem by building more and ever larger data centers to cope with the load?

To answer that question, this talk covers our years-long journey to radically improve the efficiency of how workloads are deployed in the cloud, beginning with research and OSS work on unikernels (specialized, ultra-efficient virtual machines) and covering the basics of virtualization and isolation primitives. From that basis, we will describe how we leveraged that work to build a cloud platform that can start any workload in a few milliseconds and cram 1M+ such lightweight VMs into a single off-the-shelf server, allowing millions of strongly isolated agents to be hosted in a rack rather than an entire data center. Finally, we will show a live demo of this in action, including details of a k8s integration that retains these millisecond semantics.


Speaker

Felipe Huici

CEO and Co-Founder @Unikraft, Founder and Maintainer of the Linux Foundation Unikraft Open Source Project

Felipe is CEO and Co-Founder of Unikraft, a start-up building the next-generation cloud platform. Previously, he worked as chief researcher at NEC Laboratories Europe. He has published in several top-tier conferences such as SOSP, ASPLOS, OSDI, EuroSys, SIGCOMM, NSDI and CoNEXT, and has given talks at Open Source Summit, P99 and QCon, among others. Felipe is also one of the founders and maintainers of the Linux Foundation Unikraft open source project.

From the same track

Session Deterministic Simulation Testing

A Deterministic Simulation Testing (DST) Journey: From WASM in Go to State Machines in Rust

Monday Mar 16 / 10:35AM GMT

Deterministic simulation testing finds bugs by exploring random execution paths, injecting failures, and letting you replay any failure with a single starting seed.

Alfonso Subiotto

Software Engineer @Polar Signals

Session WebAssembly

Designing Language-Agnostic Plugin Systems With WebAssembly and Extism

Monday Mar 16 / 01:35PM GMT

Imagine a world where anyone could write plugins or extensions in any language and have them interoperate with the application, regardless of your stack. Extism makes that real by using WebAssembly.

Shivay Lamba

Developer Experience Engineer @Qualcomm, Google Summer of Code Admin @Jenkins

Session

Rust in the Real World

Monday Mar 16 / 03:55PM GMT

Details coming soon.

Session

Go in the Real World

Monday Mar 16 / 02:45PM GMT

Details coming soon.

Session

Unconference: Native Languages

Monday Mar 16 / 11:45AM GMT