Building an AI Gateway Without Frameworks: One Platform, Many Agents

Abstract

Early AI integrations often start small: wrap an inference API, add a prompt, ship a feature. At Zoox, that approach grew into Cortex, a production AI gateway supporting multiple model providers, multiple modalities, and agentic workflows with dozens of tools, serving over 100 internal clients. The platform was built without heavyweight frameworks, and that was intentional.


This talk is a deep dive into the architecture of Cortex and the primitives that make it work. It covers how a thin inference layer evolved into a multi-tenant platform that handles provider failover, quota management, and the operational realities of serving diverse clients, along with the lessons learned and best practices that emerged from running it in production.
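As a rough illustration of two of the responsibilities named above, provider failover and quota management, the sketch below shows one way a thin gateway layer could combine them. It is not Cortex's implementation: the `Gateway` class, the exception types, and the per-client request budget are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable


class ProviderError(Exception):
    """Raised by a provider callable when it cannot serve the request."""


class QuotaExceeded(Exception):
    """Raised when a client has used up its request budget."""


@dataclass
class Gateway:
    # Ordered (name, callable) pairs; earlier entries are preferred.
    providers: list[tuple[str, Callable[[str], str]]]
    # Hypothetical per-client request budget (assumption, not from the talk).
    quotas: dict[str, int]
    used: dict[str, int] = field(default_factory=dict)

    def complete(self, client_id: str, prompt: str) -> tuple[str, str]:
        # Quota check: unknown clients get a budget of zero.
        if self.used.get(client_id, 0) >= self.quotas.get(client_id, 0):
            raise QuotaExceeded(client_id)
        # The request is charged once, regardless of how many providers are tried.
        self.used[client_id] = self.used.get(client_id, 0) + 1

        last_error = None
        for name, call in self.providers:
            try:
                # Return the name of the provider that answered, plus its output.
                return name, call(prompt)
            except ProviderError as exc:
                last_error = exc  # fall through to the next provider
        raise ProviderError("all providers failed") from last_error
```

Callers see a single `complete` call; which provider answered is reported back rather than chosen by the client.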


A central concept is “Agents as an API.” Rather than embedding agent logic in clients, Cortex exposes a contract in which clients declare the tools they want and the gateway owns the agent loop, tool invocation, and execution boundaries. The design is informed by what worked and what did not in production. This model supports a range of clients, including the Zoox Intelligence Slack bot, where behavior is configured per channel: an infrastructure channel activates incident management and deployment tools, while a recruiting channel activates calendar and email tools. Different tools and prompts, one deployed bot, no code changes required.
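The “Agents as an API” contract can be sketched as follows. This is a minimal illustration under stated assumptions, not the Cortex API: `TOOL_REGISTRY`, `CHANNEL_CONFIG`, `AgentRequest`, `request_for_channel`, and `run_agent` are hypothetical names, the tools are stubs, and `model` stands in for any callable that returns either a tool call or a final answer.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical tools owned by the gateway; clients refer to them by name only.
TOOL_REGISTRY: dict[str, Callable[[str], str]] = {
    "incident_lookup": lambda arg: f"incident {arg}: mitigated",
    "calendar_search": lambda arg: f"free slots for {arg}: Tue 10:00",
}

# Hypothetical per-channel configuration: data, not code.
CHANNEL_CONFIG = {
    "#infra": {"prompt": "You help with incidents.", "tools": ["incident_lookup"]},
    "#recruiting": {"prompt": "You help with scheduling.", "tools": ["calendar_search"]},
}


@dataclass
class AgentRequest:
    system_prompt: str
    tools: list[str]      # tool names the client declares
    message: str
    max_steps: int = 5    # execution boundary enforced by the gateway


def request_for_channel(channel: str, message: str) -> AgentRequest:
    cfg = CHANNEL_CONFIG[channel]
    return AgentRequest(cfg["prompt"], list(cfg["tools"]), message)


def run_agent(request: AgentRequest, model: Callable) -> str:
    # Only declared tools are reachable; an unknown name raises KeyError here.
    allowed = {name: TOOL_REGISTRY[name] for name in request.tools}
    transcript = [request.message]
    for _ in range(request.max_steps):
        action = model(request.system_prompt, transcript, sorted(allowed))
        if action["type"] == "final":
            return action["text"]
        tool = allowed.get(action["tool"])
        if tool is None:
            # The model asked for a tool this client did not declare.
            transcript.append(f"error: tool {action['tool']} is not enabled")
            continue
        transcript.append(tool(action["input"]))
    return "stopped: step limit reached"
```

In this sketch, adding a channel means adding an entry to `CHANNEL_CONFIG`; the loop in `run_agent` does not change.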


The platform closes the loop through evaluation. User feedback is captured as structured signals and fed back into the system, enabling continuous assessment of agent behavior and grounding iteration in real usage.
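One possible shape for such a structured signal is sketched below. The abstract does not describe the actual schema, so the `FeedbackSignal` fields and the `helpful_rate` aggregation are assumptions made purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FeedbackSignal:
    # All field names are hypothetical.
    request_id: str   # ties the feedback to one gateway request
    agent: str        # which agent configuration produced the answer
    helpful: bool     # e.g. a thumbs-up or thumbs-down reaction
    comment: str = ""


def helpful_rate(signals: Iterable[FeedbackSignal]) -> dict[str, float]:
    """Return the fraction of helpful responses per agent."""
    totals: dict[str, int] = defaultdict(int)
    positives: dict[str, int] = defaultdict(int)
    for signal in signals:
        totals[signal.agent] += 1
        positives[signal.agent] += int(signal.helpful)
    return {agent: positives[agent] / totals[agent] for agent in totals}
```

Because each signal carries a request identifier, a low rate for one agent could be traced back to the specific prompts and tool calls behind it.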

You will leave with a practical blueprint, grounded in lessons learned, for building a multi-tenant AI gateway that scales across teams without scaling complexity.


Speaker

Amit Navindgi

Staff Software Engineer @Zoox

Amit Navindgi is a Staff Software Engineer at Zoox, where he leads Zoox Intelligence — an initiative applying Large Language Models (LLMs) across engineering, operations, customer support, and autonomy. He builds products and platforms that combine technical depth with thoughtful design, creating interactions that are both intuitive to use and elegant to build. His expertise spans Applied AI, Observability, Semantic Search, Experimentation Platforms, Data Engineering, Frontend Development, and Oncall and Incident Management Systems.

He also runs the Zoox AI Hackathon and The Assembly, a cross-functional forum for knowledge sharing and innovation.

Earlier in his career, he developed web applications and distributed systems at Veritas Technologies and focused on Natural Language Processing at the Xerox Research Centre Europe.

From the same track

Session

Reliable Retrieval for Production AI Systems

Search is central to many AI systems. Everyone is building RAG and agents right now, but few are building reliable retrieval systems.

Lan Chu

AI Tech Lead and Senior Data Scientist

Session

Beyond Context Windows: Building Cognitive Memory for AI Agents

AI agents are rapidly changing how users interact with software, yet most agentic systems today operate with little to no intelligent memory, relying instead on brittle context-window heuristics or short-term state.

Karthik Ramgopal

Distinguished Engineer & Tech Lead of the Product Engineering Team @LinkedIn, 15+ Years of Experience in Full-Stack Software Development

Session

Refreshing Stale Code Intelligence

Coding models are helping software developers move faster than ever before, but weirdly, they’re not keeping up with our fast progress. The models that power code generation are often based on months- to years-old snapshots of open source code.

Jeff Smith

CEO & Co-Founder @ 2nd Set AI, AI Engineer, Researcher, Author, Ex-Meta/FAIR

Session

Rewriting All of Spotify's Code Base, All the Time

We don't need LLMs to write new code. We need them to clean up the mess we already made. In mature organizations, we have to maintain and migrate the existing codebase. Engineers are constantly balancing new feature development with endless software upkeep.

Jo Kelly-Fenton

Engineer @Spotify

Aleksandar Mitic

Software Engineer @Spotify

Session

Async Agents in Production: Failure Modes and Fixes

As models improve, we are starting to build long-running, asynchronous agents such as deep research agents and browser agents that can execute multi-step workflows autonomously. These systems unlock new use cases, but they fail in ways that short-lived agents do not.

Meryem Arik

Co-founder and CEO @Doubleword (previously TitanML)