Scaling Test Time Compute to Multi-Agent Civilizations — Noam Brown, OpenAI



AI Summary

In this podcast episode, Alessio (partner and CTO at Decibel) and his co-host swyx (founder of Smol AI) talk with Noam Brown of OpenAI about a wide range of topics in AI research and applications, focusing on game AI, reasoning paradigms, and the future of AI capabilities. Key points include Noam's experience winning the 2025 Diplomacy World Championship after working on Cicero, an AI bot for the game Diplomacy. Noam explains how working with the bot shaped his gameplay and strategy, and why complex multi-agent games like Diplomacy demand a shift from game-theory-optimal play toward adaptive, exploitative play, in contrast to two-player zero-sum games like poker.
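
As a rough illustration of that distinction (not from the episode), the sketch below contrasts game-theory-optimal (GTO) play with exploitative play using rock-paper-scissors as a toy game: the GTO strategy is unexploitable but earns nothing against a weak opponent, while a best response to an estimated opponent model profits from the opponent's mistakes. The payoff matrix and opponent model here are illustrative assumptions.

```python
# Minimal sketch: GTO vs. exploitative play in rock-paper-scissors.
# (Illustrative only; Diplomacy itself is vastly more complex.)
import numpy as np

# Payoff matrix for the row player: rows/cols = (rock, paper, scissors).
PAYOFF = np.array([
    [ 0, -1,  1],   # rock
    [ 1,  0, -1],   # paper
    [-1,  1,  0],   # scissors
])

# GTO (Nash equilibrium) strategy: uniform random play. It cannot be
# exploited, but it also cannot profit from a weak opponent.
gto_strategy = np.array([1/3, 1/3, 1/3])

def best_response(opponent_model: np.ndarray) -> np.ndarray:
    """Exploitative play: best-respond to an estimated opponent distribution."""
    expected_values = PAYOFF @ opponent_model  # EV of each of our actions
    response = np.zeros(3)
    response[np.argmax(expected_values)] = 1.0  # pure best response
    return response

# Suppose we observe an opponent who over-plays rock.
opponent_model = np.array([0.6, 0.2, 0.2])

ev_gto = gto_strategy @ PAYOFF @ opponent_model
ev_exploit = best_response(opponent_model) @ PAYOFF @ opponent_model
print(f"EV of GTO play:          {ev_gto:+.2f}")      # 0.00 (safe but flat)
print(f"EV of exploitative play: {ev_exploit:+.2f}")  # +0.40 (plays paper)
```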

They also explore the rapid progress in AI reasoning models (the 'reasoning paradigm'), the challenges of scaling test-time compute so that models think longer and deeper, and the potential and limitations of current reinforcement fine-tuning techniques. Noam discusses the trajectory of AI development at OpenAI, the importance of balancing safety and capability, and emerging research on multi-agent cooperation and competition.
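
One common way to "think longer" at inference time is best-of-n sampling: spend n times the compute by generating n candidate answers and keeping the highest-scoring one. The sketch below is a minimal illustration under that assumption; generate_candidate and score_candidate are hypothetical placeholders, not OpenAI APIs or the specific method discussed in the episode.

```python
# Minimal sketch of one test-time-compute recipe: best-of-n sampling.
import random
from typing import Callable

def best_of_n(
    prompt: str,
    generate_candidate: Callable[[str], str],
    score_candidate: Callable[[str, str], float],
    n: int = 8,
) -> str:
    """Spend n times the compute at inference: sample n answers, keep the best."""
    candidates = [generate_candidate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score_candidate(prompt, c))

# Toy stand-ins so the sketch runs: a noisy "model" and a verifier that
# rewards longer answers (standing in for "more thorough" reasoning).
def toy_generate(prompt: str) -> str:
    return prompt + " answer" * random.randint(1, 5)

def toy_score(prompt: str, candidate: str) -> float:
    return float(len(candidate))

print(best_of_n("2+2=?", toy_generate, toy_score, n=8))
```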

Later, the conversation covers practical AI use cases in coding with OpenAI's Codex model and the Windsurf tool, AI's limitations and growth areas across the software development lifecycle, and future prospects for AI in remote work and virtual assistants. Noam shares insights on data efficiency in AI learning compared to humans, and briefly touches on robotics and the challenges of physical embodiment versus virtual AI.

They conclude by discussing the significance of self-play in AI improvement, which is especially powerful in two-player zero-sum games but far harder to apply in complex multi-agent settings, and the difficulty that imperfect-information games like Magic: The Gathering pose for AI, emphasizing the need for general reasoning improvements over specialized search techniques.
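
To make the self-play point concrete, the sketch below runs fictitious play on the same rock-paper-scissors payoffs as above: each agent repeatedly best-responds to the other's empirical average strategy, and in two-player zero-sum games those averages provably converge to a Nash equilibrium. That convergence guarantee is exactly what breaks down in general multi-agent settings. This is an assumption-laden toy, not Cicero's training procedure.

```python
# Minimal sketch: self-play via fictitious play in a two-player zero-sum game.
import numpy as np

PAYOFF = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])  # row player's payoff

counts = [np.ones(3), np.ones(3)]  # smoothed action counts for each player

for _ in range(20000):
    # Each player best-responds to the other's empirical mixed strategy.
    for me, opp in ((0, 1), (1, 0)):
        opp_avg = counts[opp] / counts[opp].sum()
        # Player 1's payoff matrix is the negated transpose (zero-sum game).
        payoff = PAYOFF if me == 0 else -PAYOFF.T
        counts[me][np.argmax(payoff @ opp_avg)] += 1

for i, c in enumerate(counts):
    print(f"player {i} empirical strategy:", np.round(c / c.sum(), 3))
# Both empirical strategies converge toward the uniform (1/3, 1/3, 1/3)
# Nash equilibrium, which is what makes self-play so effective in this class.
```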