The past year has seen an explosion in the use of AI, and of agents in particular, a trend that is guaranteed to accelerate. The sheer number of agents, and how heavily they are used, puts immense pressure on the cloud infrastructure they run on; it is not uncommon to hear of services creating tens of millions of agent sandboxes in a matter of weeks. In the face of such disruptive scale, what can be done to ensure that we don’t keep throwing money at the problem by building more and ever-larger data centers to cope with the load?

To answer that question, this talk covers our years-long journey to radically improve the efficiency of how workloads are deployed on the cloud, beginning with research and OSS work on unikernels (specialized, ultra-efficient virtual machines) and covering the basics of virtualization and isolation primitives. From that basis, we will describe how we used that work to build a cloud platform that can start any workload in a few milliseconds and cram 1M+ such lightweight VMs into a single, off-the-shelf server, allowing millions of strongly isolated agents to be hosted in a rack rather than an entire data center. Finally, we will show a live demo of this in action, including details of a Kubernetes integration that retains these millisecond start times.

From the same track

Session Deterministic Simulation Testing

A Deterministic Simulation Testing (DST) Journey: From WASM in Go to State Machines in Rust

Monday Mar 16 / 10:35AM GMT

Deterministic simulation testing finds bugs by exploring random execution paths, injecting failures, and letting you replay any failure with a single starting seed.

Alfonso Subiotto

Software Engineer @Polar Signals

Session performance

Understanding and Tuning System Performance with CPU Hardware Counters

Monday Mar 16 / 05:05PM GMT

Counters are fundamental to monitoring: how many requests were processed, how many CPU-seconds consumed, how many bytes sent over a network. Very likely you are already monitoring your applications and operating systems via the hundreds or thousands of counters they expose.

Bryan Boreham

Distinguished Engineer @Grafana Labs, Member of the Prometheus Team, Expert in Distributed Systems and Computer Performance

Session

Use<’lifetimes> For<’what>

Monday Mar 16 / 02:45PM GMT

As Rust has become more ergonomic, lifetimes have become more nuanced. By thinking of lifetimes as sets of loans, rather than using the traditional "regions of code" definition, this talk explores advanced lifetime concepts such as variance and higher-ranked lifetimes.

Ethan Brierley

Senior Engineer @TrueLayer and Co-Organiser of Rust London

Session

Unconference: Native Languages

Monday Mar 16 / 11:45AM GMT

Fixing the AI Infra Scale Problem by Stuffing 1M Sandboxes in a Single Server

Speaker

Felipe Huici
