Abstract
If you've made a bank transfer recently, there's a good chance that Form3 handled it. When we have a wobble, people notice. So when the UK's payments regulator said they were worried about cloud concentration risk, we knew we had a challenge ahead: we needed to keep running when our cloud provider failed. What followed was a company-wide effort to unpick our cloud-specific dependencies and replace them with a cloud-agnostic active/active/active architecture spanning AWS, GCP, and Azure. We'll break down the design trade-offs, operational costs, dead-ends, and hard slog needed to build this platform. We'll share what we've learned from the experience of several years' operation, and what really happens when a cloud falls over.
On the other side of the pond, we'll look at how we needed to adapt this multi-cloud architecture to a new market. We quickly learned that dominant perceptions in the US would prevent us from lifting and shifting our solution from the UK, so we had to re-evaluate our choices and ultimately take a step backward to achieve market fit. We'll look at the power of 'resilience stories', how we architected to address them, and how we handled our first major incident without the comfort of active/active/active multi-cloud.
Key Takeaways:
- How to run real-world active/active/active multi-cloud
- When to lean on cloud's hosted offerings, and when to stay agnostic
- How to adapt architecture to fit resilience narratives
Interview:
What is your session about, and why is it important for senior software developers?
Kev and I are sharing details about Form3's journey to multi-cloud. We'll talk about the practical work required to run active/active/active, and what we've learned along the way. Multi-cloud is awesome tech, but it doesn't necessarily fit everywhere, so we'll also be looking at how we took a different approach when we launched in the US last year.
Running on three clouds at once might seem a bit extreme, but we feel that customers and regulators are becoming less tolerant of the idea that we all fall over when one cloud has a wobble. This talk should give senior engineers and architects an idea of whether multi-cloud is really for them and, if so, how to actually make it happen.
Why is it critical for software leaders to focus on this topic right now, as we head into 2026?
We've seen all the major cloud providers have outages recently, and I don't think there's any sign of those going away. For those of us in critical domains, where we can't afford to stop and wait for a cloud to recover, we need a concrete plan to isolate ourselves from these inevitable events and keep running.
What are the common challenges developers and architects face in this area?
The big one is decoupling an application from cloud-specific dependencies. Depending on your existing architecture, that might need some sweeping changes. Then, there are the operational headaches of dealing with fleets of machines across multiple clouds, in dev/test/production environments. Just rolling out an operating system patch can be a pain without the right tooling. We're fortunate to have a great Platform team at Form3, but I imagine lots of other organisations would struggle to support the kind of platform engineering needed to make these shifts and still allow the rest of the engineering team to keep moving.
What's one thing you hope attendees will implement immediately after your talk?
I hope they'll embrace being cloud-agnostic. Even if you're not running multi-cloud, not depending on a high-value-add managed service vastly improves your chances of survival when a CSP's control plane is misbehaving.
What makes QCon stand out as a conference for senior software professionals?
I came last year and the quality of talks, questions, and just conversations was excellent. There's no sales pressure anywhere. It just feels like a bunch of smart, experienced people genuinely exchanging ideas about how to grow and adapt in the industry.
What was one interesting thing that you learned from a previous QCon?
I had a great chat about 'Resilience Stories' during one of the unconference sessions that has stayed with me, and has become a component of this talk. The way that we think about resilience is in part cultural, and the best solution for one market won't work everywhere.
Speaker
Ross McFarlane
Technical Architect @Form3, Real-Time Payments Plumber, Technical Diplomat, Drawer of Diagrams
Ross McFarlane is a Technical Architect at Form3, where he supports a team of engineers building real-time payment products for the US market. For a software engineer, he spends rather a lot of time talking and drawing diagrams. Having been in leadership positions for the past fifteen years, he’s made plenty of mistakes and learned from most of them.
Speaker
Kevin Holditch
Engineering Leader and Distributed Systems Practitioner @Form3
Kevin Holditch is an engineering leader and distributed systems practitioner at Form3, building resilient, cloud-native platforms for critical financial infrastructure. His work focuses on high-availability architectures, multi-cloud environments, and infrastructure automation at scale.
Having operated across hands-on engineering and architectural leadership roles, Kevin combines deep technical expertise with experience scaling teams through periods of rapid growth. He is particularly interested in the practical realities of operating large distributed systems: failure modes, resilience engineering, developer productivity, and the trade-offs inherent in complex platforms.
Kevin is the author of Terraform: From Beginner to Master, reflecting his long-standing interest in infrastructure as code and operational simplicity.
He enjoys tackling both technical and organisational complexity, and believes robust systems are built by empowered teams with clear ownership.