Abstract

Queues are an essential component of a scalable distributed system, but going beyond a simple implementation creates an explosion of complexity to manage.

Suddenly there are five different places where you could be creating bottlenecks, and you might not even notice until a customer tells you things are running slowly or you're drowning in dashboards.

In this presentation we’ll share our experience and expertise in operating, debugging and managing queueing systems with a focus on:

Why distributed tracing is a requirement for successful operations with queues
How OpenTelemetry standards make it easy to bring distributed tracing to your systems
How distributed tracing helped Gearset take control of our queueing problems
How we dealt with distributed sampling, total operation duration and long-running traces

Speaker

Julian Wreford

Team Lead of Operability Team @Gearset, Software Engineer Turned Accidental SRE

Julian Wreford is an engineering team lead at Gearset where he leads the team responsible for all things site reliability. After starting as a developer, he quickly became interested in operability and has helped lead the growth of observability culture and incident response at Gearset as the company has scaled from small teams to large enterprises. He is passionate about developer ownership throughout the software lifecycle and enjoys empowering developers to better understand and debug the code they write when it is running at scale.

Speaker

Oli Lane

Engineering Team Lead @Gearset, Focusing on Engineering Culture, Observability, and Platform Reliability

Oli is an Engineering Team Lead and self-described "Jack of at least some trades." A fixture at Gearset for over ten years, he has ridden the wave from a scrappy 7-person startup to a 350+ employee scale-up.

Along the way, he has gained deep experience across both product and infrastructure teams, with a particular interest in the sociotechnical side of engineering. Currently, Oli focuses on platform engineering and observability, building the culture and tools needed for high-performing teams and reliable systems.

From the same track

Session architecture

From Fan-Out to Fast: Sub-100ms API Design in Distributed Systems

Monday Mar 16 / 10:35AM GMT

A “simple” API request rarely stays simple. In distributed systems, one call quickly turns into fan-out across gateways, services, caches, and databases — and your p99 becomes the sum of every hop and every flaky dependency.

Saranya Vedagiri

Senior Staff Engineer @eBay

Session Platform Engineering

APIs for Agents: Rethinking API Programs in the MCP Era

Monday Mar 16 / 11:45AM GMT

As API programs mature, a familiar gap emerges: some teams operate with strong standards, reusable platforms, and clear governance, while others rely on informal guidance and best-effort consistency.

Jim Gough

Distinguished Engineer, API Platform Lead Architect @Morgan Stanley, Co-Author of Optimizing Java

Andreea Niculcea

Vice President @Morgan Stanley

Session

Beyond the Dashboard: Why 'Query-ability' is the New Observability

Monday Mar 16 / 01:35PM GMT

Details coming soon.

Session

Async-First: Architecting for Event-Driven Connectivity

Monday Mar 16 / 05:05PM GMT

Details coming soon.

Session

Unconference: Connecting Systems

Monday Mar 16 / 02:45PM GMT

Uncorking Queueing Bottlenecks with OpenTelemetry
