Summary
Disclaimer: This summary has been generated by AI. It is experimental, and feedback is welcomed. Please reach out to info@qconlondon.com with any comments or concerns.
Anderson Parra discusses strategies for handling sudden traffic spikes, which can overwhelm even well-designed infrastructure, drawing on examples from platforms such as SeatGeek.
- Context: High-demand ticket onsales can lead to extreme pressure on backend services, demanding resilient architecture.
- Challenge: Traditional methods like autoscaling take time, risking system collapse under heavy load.
Architectural Approach:
- Layered shielding architecture distributes responsibilities to manage traffic effectively.
- Edge Mechanisms: Caching, shielding, and admission control mechanisms absorb traffic bursts.
- API Gateways: Enforce fairness through rate limiting and request validation.
- Kubernetes Policies: Control networking to protect service boundaries and manage failures.
Principles for Resilience:
- Observe Early: Systems should react to pressure signals, not just failures.
- Control Failures: Use rate limits and queues to control how load enters the system.
- Protect the Core: Prioritize critical services to ensure core system survival.
- Continuous Adaptation: Systems must adapt through observability and feedback signals.
Implementation Strategies:
- Utilize signal-based control mechanisms to adjust rate limits and identify abusive actors.
- Use a feedback loop to dynamically adapt system behavior based on observed pressures.
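Such a feedback loop can be sketched in a few lines. The snippet below is an illustrative AIMD-style (additive-increase, multiplicative-decrease) controller that adapts a rate limit from an observed pressure signal, here p99 latency; all names and thresholds are hypothetical, not SeatGeek's actual implementation.

```python
class AdaptiveRateLimiter:
    """Illustrative feedback loop: tighten the rate limit sharply when a
    pressure signal breaches its target, and probe upward gently when
    the system is healthy (AIMD, as in TCP congestion control)."""

    def __init__(self, limit_rps=1000, floor_rps=50, ceiling_rps=5000,
                 latency_slo_ms=250):
        self.limit_rps = limit_rps            # currently admitted request rate
        self.floor_rps = floor_rps            # never throttle below this
        self.ceiling_rps = ceiling_rps        # never admit more than this
        self.latency_slo_ms = latency_slo_ms  # pressure threshold

    def observe(self, p99_latency_ms):
        """Adjust the limit from one observation window's p99 latency."""
        if p99_latency_ms > self.latency_slo_ms:
            # Backend under pressure: shed load multiplicatively.
            self.limit_rps = max(self.floor_rps, self.limit_rps * 0.5)
        else:
            # Healthy: probe for more capacity additively.
            self.limit_rps = min(self.ceiling_rps, self.limit_rps + 100)
        return self.limit_rps
```

The asymmetry is deliberate: a system in distress recovers capacity slowly, so the loop cuts admission fast and restores it cautiously.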
Concluding Remarks: Traffic spikes are inevitable, but collapse is a design choice. Resilient systems should focus on early signals and controlled failures to maintain stability under load.
This is the end of the AI-generated content.
Abstract
High-demand events can cause sudden traffic spikes that overwhelm even well-designed systems. In ticketing platforms, millions of users — alongside increasingly sophisticated automated agents — may arrive simultaneously, placing extreme pressure on backend services.
At SeatGeek, we observed that even elastic infrastructure has limits: autoscaling takes time to react, and systems must survive while capacity catches up. To address this gap, we designed a layered shielding architecture that distributes defensive responsibilities across multiple parts of the platform.
At the edge, caching, shielding, and admission control mechanisms such as queueing absorb traffic bursts before they reach the origin. API gateways enforce fairness through rate limiting and request validation. Deeper in the stack, Kubernetes-native networking policies and platform controls help contain failures and protect service boundaries.
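The per-client fairness enforced at the gateway is commonly implemented as a token bucket: each client may burst up to a fixed capacity and is then throttled to a steady refill rate. The sketch below is a minimal, self-contained illustration of that pattern, not the gateway's actual code.

```python
import time

class TokenBucket:
    """Per-client token bucket: admits bursts up to `capacity` requests,
    then throttles to `refill_rate` requests per second. The clock is
    injectable so the behavior is testable without real waiting."""

    def __init__(self, capacity, refill_rate, now=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity  # start full: allow an initial burst
        self.now = now
        self.last = now()

    def allow(self):
        """Return True and consume a token if the request is admitted."""
        t = self.now()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (t - self.last) * self.refill_rate)
        self.last = t
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Keyed per client (for example by API key or session), buckets like this stop any single caller from consuming more than its share during a spike.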
This layered approach allows the system to shed load early, protect critical services, and degrade gracefully during extreme demand. But resilience is not static: traffic patterns evolve, new bottlenecks emerge, and systems must continuously adapt through observability and feedback signals.
In this talk, we will explore the architecture and operational lessons behind building multi-layer shields that protect core systems under internet-scale traffic, and share practical insights for designing resilient platforms that can withstand traffic stampedes without bringing down the entire ecosystem.
Interview:
What is your session about, and why is it important for senior software developers?
This talk explores how to design resilient systems that can withstand extreme traffic spikes without collapsing. Using real-world examples from ticketing platforms, I will show how distributing defensive responsibilities across layers — edge, gateway, and platform infrastructure — helps protect critical services during sudden demand surges. Senior engineers often operate systems where scaling alone is not enough; resilience requires intentional architecture and operational controls. The session focuses on practical patterns that help systems degrade gracefully rather than fail catastrophically.
Why is it critical for software leaders to focus on this topic right now, as we head into 2026?
Traffic patterns are becoming less predictable as automated agents, AI-driven clients, and global user demand increase system pressure. At the same time, modern platforms rely on complex distributed architectures in which small failures can quickly cascade. Leaders need to design systems that assume sudden spikes and evolving traffic behavior. Building resilience through layered defenses and clear operational signals is becoming essential for maintaining reliability at scale.
What are the common challenges developers and architects face in this area?
A common misconception is that cloud elasticity alone solves scalability problems. In reality, autoscaling takes time to react, and systems often experience instability before capacity catches up. Teams also struggle to identify truly critical services, manage noisy-neighbor effects in shared infrastructure, and detect early signals of system stress. Designing architectures that shed load early and protect the core system requires coordination across multiple platform layers.
What's one thing you hope attendees will implement immediately after your talk?
I hope attendees rethink where traffic control happens in their systems. Instead of relying solely on backend scaling, they should introduce earlier defenses — such as caching, admission control, and rate limiting — to absorb pressure before it reaches core services. Even small changes at the edge or gateway layer can dramatically improve system stability during traffic spikes.
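The admission-control idea mentioned above, queueing users before they reach core services, can be sketched as a virtual waiting room. The names and policy below are purely illustrative: admit up to a fixed number of concurrent sessions and hold later arrivals in FIFO order until capacity frees up.

```python
from collections import deque

class AdmissionController:
    """Illustrative waiting-room sketch: at most `max_active` users are
    admitted to the protected origin; the rest wait in arrival order
    and are admitted as earlier sessions finish."""

    def __init__(self, max_active):
        self.max_active = max_active
        self.active = set()    # user IDs currently inside the origin
        self.queue = deque()   # user IDs waiting, FIFO

    def arrive(self, user_id):
        """Admit immediately if there is capacity, otherwise enqueue."""
        if len(self.active) < self.max_active:
            self.active.add(user_id)
            return "admitted"
        self.queue.append(user_id)
        return "queued"

    def finish(self, user_id):
        """Release a session and admit waiting users up to capacity."""
        self.active.discard(user_id)
        while self.queue and len(self.active) < self.max_active:
            self.active.add(self.queue.popleft())
```

Because the origin never sees more than `max_active` concurrent users, backend load stays bounded no matter how large the arrival spike is.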
What makes QCon stand out as a conference for senior software professionals?
QCon focuses on real engineering experience rather than hype or vendor-driven content. Speakers share lessons learned from operating large-scale systems in production, including the trade-offs and failures behind architectural decisions. This creates an environment where senior engineers can learn from peers facing similar challenges. The emphasis on practical insight and honest technical discussion makes QCon particularly valuable.
Speaker
Anderson Parra
Staff Software Engineer @SeatGeek
Anderson Parra is a Staff Software Engineer on SeatGeek’s Cloud Platform team, where he works on the infrastructure that powers high-demand ticket onsales. His work focuses on building resilient systems that can withstand internet-scale traffic and on designing layered defenses across edge, API gateways, and Kubernetes platforms to protect core services while preserving a fair user experience.
Over the past 18+ years, Anderson has built and operated large-scale distributed systems handling massive traffic and data volumes for companies in Brazil, Ireland, Germany, the UK, and the United States. He has worked across a wide range of technologies, including Java, Scala, Go, Ruby, Python, JavaScript, and Lua, with a strong focus on platform engineering and distributed systems.
Anderson holds a master’s degree in distributed systems, earned for his research “A Lightweight Reconfiguration Solution for Paxos”.