We’re all excited to build and deliver agentic AI services. But what about running at the
exponentially greater scale that agents create? LLMs suffer from high latency and spotty
availability. More frequent model training drives more frequent updates to agentic services. Most of
all, the cost of running LLMs at agentic scale breaks the bank, fast. So, what can you do?
In this session, we’ll dig into how engineering and operations can address:
● Making agentic services fail-proof when their LLMs are not (see the sketch after this list)
● Managing a two-order-of-magnitude increase in TPS, including a 2M TPS RAG case study
● Navigating cost vs. quality tradeoffs, with an LLM call costing up to 100,000x more than a database transaction
● Continuously redeploying agents whose underlying models are frequently retrained
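
To make the first bullet concrete, here is a minimal sketch of the fail-proofing pattern in Scala. The primaryModel and cheaperModel calls are hypothetical stand-ins, not Akka's or any vendor's API: the agent puts a hard deadline on the expensive model, falls back to a cheaper one, and finally returns a canned reply rather than failing outright.

    import scala.concurrent.{Await, Future}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration._
    import scala.util.Try

    object ResilientCompletion {
      // Hypothetical LLM calls; real ones would hit remote model endpoints.
      def primaryModel(prompt: String): Future[String] =
        Future(s"frontier-model answer for: $prompt")

      def cheaperModel(prompt: String): Future[String] =
        Future(s"small-model answer for: $prompt")

      // Degrade gracefully: a hard deadline on the expensive model, then a
      // cheaper model, then a canned reply, so the agent never fails outright.
      def complete(prompt: String, deadline: FiniteDuration = 2.seconds): String =
        Try(Await.result(primaryModel(prompt), deadline))
          .orElse(Try(Await.result(cheaperModel(prompt), deadline)))
          .getOrElse("Sorry, I can't answer that right now.")

      def main(args: Array[String]): Unit =
        println(complete("Summarize the open incidents."))
    }

The same ladder extends to cached answers or request shedding; the point is that every LLM dependency has a bounded, cheaper alternative behind it.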
After this session, we invite you to attend part 2 of the discussion: “From Concept to Code: Navigating Agentic AI Services.”
Session Sponsored By

Akka is used to develop resilient, low-latency, large-scale, cloud-to-edge distributed applications.