Abstract
Conversational AI doesn’t fail in the lab. It fails in production. The moment real users arrive, the combinatorial explosion of language, ambiguity, and intent exposes a hard truth: traditional QA, flow-based design, and manual testing are structurally incapable of preparing AI systems for reality. What looks stable in staging quickly degrades under long-tail behavior, creating silent failure modes that directly impact revenue, trust, and compliance. This session examines why common product assumptions collapse in AI interfaces and why even leading companies struggle to maintain reliability at scale. It introduces Generated Simulation — a systematic approach to simulating user behavior at production scale — enabling teams to quantify risk, surface blind spots, and transform unpredictability into actionable product insight.
Topics To Be Covered
Why AI fails in production
Hidden long-tail behavior risks
Limits of traditional AI testing
Quantifying AI reliability at scale
Simulating real user chaos
Perfect For
AI Product Leaders
Heads of AI
LLM Engineers
Risk & Governance Leaders
CTOs & CIOs
Meet Your Speaker
Shahar Erez

Co-founder & CEO, Arato.ai
Shahar Erez is a visionary leader with 25 years of experience in enterprise B2B, transitioning from software engineering to leading product, go-to-market, and strategy initiatives. He played key roles in developing Mercury Interactive’s APM platform (acquired by HP) and VMware’s Cloud Application product suite, while also leading investments in DevOps companies like Puppet Labs and JFrog. Shahar co-founded TheOrg and Stoke (acquired by Fiverr) before founding Arato, where he continues to innovate in AI and technology.
ADDITIONAL INFORMATION
Time & Place
Wed, March 25
13:15 - 13:30
Grand Ballroom II
Roundtables & Theatre Seating
Max. Capacity: 200 Seats
Secure your seat – registration required.
Notes
Agenda for this session
10 min presentation