Ensure AI Agents Work: Evaluation Frameworks for Scaling Success — Aparna Dhinakaran, CEO, Arize



AI Summary

This session offers an executive-level perspective on evaluating AI agents at scale, with practical strategies for designing evaluation processes that drive measurable impact. It highlights how to identify performance bottlenecks and implement observability practices that keep agents reliable over time, then covers common pitfalls, best practices for iterative improvement, and how to turn experimental agents into enterprise-grade solutions. Ideal for anyone shaping their organization’s GenAI strategy or looking to maximize the potential of AI agents.