Abstract
Early AI integrations often start small: wrap an inference API, add a prompt, ship a feature. At Zoox, that approach grew into Cortex, a production AI gateway supporting multiple model providers, multiple modalities, and agentic workflows with dozens of tools, serving over 100 internal clients. The platform was built without heavyweight frameworks, and that was intentional.
This talk is a deep dive into the architecture of Cortex and the primitives that make it work. It covers how a thin inference layer evolved into a multi-tenant platform that handles provider failover, quota management, and the operational realities of serving diverse clients, along with the lessons learned and best practices that emerged from running it in production.
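As a rough illustration of what provider failover and quota management can look like at the gateway layer, here is a minimal sketch. It is not Cortex's implementation: the `Provider`, `Gateway`, `ProviderError`, and `QuotaExceeded` names, the in-order failover policy, and the request-count quota are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


class ProviderError(Exception):
    """Raised by a provider callable when a request fails."""


class QuotaExceeded(Exception):
    """Raised when a client has no remaining requests."""


@dataclass
class Provider:
    name: str
    complete: Callable[[str], str]  # hypothetical interface: prompt in, text out


@dataclass
class Gateway:
    providers: list[Provider]
    # Hypothetical quota model: remaining request count per client id.
    quotas: dict[str, int] = field(default_factory=dict)

    def complete(self, client_id: str, prompt: str) -> tuple[str, str]:
        """Return (provider_name, completion), trying providers in order."""
        remaining = self.quotas.get(client_id, 0)
        if remaining <= 0:
            raise QuotaExceeded(client_id)

        errors = []
        for provider in self.providers:
            try:
                text = provider.complete(prompt)
            except ProviderError as exc:
                errors.append(f"{provider.name}: {exc}")
                continue
            # Charge the client only for a request that succeeded.
            self.quotas[client_id] = remaining - 1
            return provider.name, text
        raise ProviderError("all providers failed: " + "; ".join(errors))


def _flaky(prompt: str) -> str:
    raise ProviderError("rate limited")


def _stable(prompt: str) -> str:
    return f"echo: {prompt}"


gateway = Gateway(
    providers=[Provider("primary", _flaky), Provider("secondary", _stable)],
    quotas={"slack-bot": 1},
)
```

In this sketch, the first call from `slack-bot` falls through to the secondary provider, and a second call raises `QuotaExceeded` because the single allotted request has been spent.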
A central concept is “Agents as an API.” Rather than embedding agent logic in clients, Cortex exposes a contract in which clients declare the tools they want and the gateway owns the agent loop, tool invocation, and execution boundaries. The contract was shaped by what worked and what did not in production. This model supports a range of clients, including the Zoox Intelligence Slack bot, where behavior is configured per channel: infrastructure channels activate incident management and deployment tools, while recruiting channels activate calendar and email tools. Different tools and prompts, one deployed bot, no code changes required.
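The contract described above might be sketched as follows. This is an illustrative sketch only, not the actual Cortex API: `AgentRequest`, `TOOL_REGISTRY`, `CHANNEL_CONFIG`, the tool names, and the `stub_model` stand-in for a real model call are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical gateway-side registry: tool name -> implementation.
TOOL_REGISTRY: dict[str, Callable[[str], str]] = {
    "incident_lookup": lambda arg: f"incident {arg}: mitigated",
    "calendar_search": lambda arg: f"free slots for {arg}: Tue 10:00",
}

# Hypothetical per-channel configuration for a single deployed bot.
CHANNEL_CONFIG = {
    "#infra": {"prompt": "You help with incidents.", "tools": ["incident_lookup"]},
    "#recruiting": {"prompt": "You help with scheduling.", "tools": ["calendar_search"]},
}


@dataclass
class AgentRequest:
    system_prompt: str
    tools: list[str]  # the client only declares tool names
    message: str


@dataclass
class ModelTurn:
    tool: Optional[str] = None  # None means the model produced a final answer
    tool_arg: str = ""
    answer: str = ""


def stub_model(request: AgentRequest, observations: list[str]) -> ModelTurn:
    """Stand-in for a model call: use the first declared tool once, then answer."""
    if request.tools and not observations:
        return ModelTurn(tool=request.tools[0], tool_arg=request.message)
    return ModelTurn(answer="; ".join(observations) or "no tools available")


def run_agent(request: AgentRequest, model=stub_model, max_steps: int = 5) -> str:
    """Gateway-owned agent loop: the client never invokes tools itself."""
    unknown = [t for t in request.tools if t not in TOOL_REGISTRY]
    if unknown:
        raise ValueError(f"unknown tools: {unknown}")
    observations: list[str] = []
    for _ in range(max_steps):  # execution boundary: bounded number of steps
        turn = model(request, observations)
        if turn.tool is None:
            return turn.answer
        if turn.tool not in request.tools:  # only declared tools may run
            raise PermissionError(f"tool not declared: {turn.tool}")
        observations.append(TOOL_REGISTRY[turn.tool](turn.tool_arg))
    raise RuntimeError("agent exceeded max_steps")


def handle_slack_message(channel: str, text: str) -> str:
    """One bot, with behavior selected by channel configuration."""
    cfg = CHANNEL_CONFIG[channel]
    return run_agent(AgentRequest(cfg["prompt"], cfg["tools"], text))
```

Under these assumptions, supporting a new channel means adding an entry to `CHANNEL_CONFIG`; the loop and the bot code stay the same.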
The platform closes the loop through evaluation. User feedback is captured as structured signals and fed back into the system, enabling continuous assessment of agent behavior and grounding iteration in real usage.
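One way to picture feedback as a structured signal is a small record tied to the request and the agent configuration that served it, which can then be aggregated. The sketch below is an assumption for illustration; the `FeedbackSignal` fields and the `helpful_rate_by_config` metric are hypothetical and do not describe Cortex's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FeedbackSignal:
    # All field names are hypothetical.
    request_id: str
    client: str
    agent_config: str  # which prompt and tool set handled the request
    helpful: bool      # e.g. a thumbs-up or thumbs-down from the user


def helpful_rate_by_config(signals: Iterable[FeedbackSignal]) -> dict[str, float]:
    """Fraction of responses marked helpful, grouped by agent configuration."""
    counts: dict[str, list[int]] = defaultdict(lambda: [0, 0])  # [helpful, total]
    for signal in signals:
        counts[signal.agent_config][0] += int(signal.helpful)
        counts[signal.agent_config][1] += 1
    return {cfg: helpful / total for cfg, (helpful, total) in counts.items()}


signals = [
    FeedbackSignal("r1", "slack-bot", "infra-v1", True),
    FeedbackSignal("r2", "slack-bot", "infra-v1", False),
    FeedbackSignal("r3", "slack-bot", "recruiting-v1", True),
]
rates = helpful_rate_by_config(signals)
```

Grouping by configuration rather than by client is a design choice in this sketch: it lets a change to a prompt or tool set be compared against the version it replaced.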
You will leave with a practical blueprint, grounded in lessons learned, for building a multi-tenant AI gateway that scales across teams without scaling complexity.
Speaker
Amit Navindgi
Staff Software Engineer @Zoox
Amit Navindgi is a Staff Software Engineer at Zoox, where he leads Zoox Intelligence, an initiative applying Large Language Models (LLMs) across engineering, operations, customer support, and autonomy. He builds products and platforms that combine technical depth with thoughtful design, creating interactions that are both intuitive to use and elegant to build. His expertise spans Applied AI, Observability, Semantic Search, Experimentation Platforms, Data Engineering, Frontend Development, and Oncall and Incident Management Systems.
He also runs the Zoox AI Hackathon and The Assembly, a cross-functional forum for knowledge sharing and innovation.
Earlier in his career, he developed web applications and distributed systems at Veritas Technologies and focused on Natural Language Processing at the Xerox Research Centre Europe.