Sarah Chieng
Activities
historical
Sarah Chieng is a developer advocate and AI engineering practitioner at Cerebras Systems, the hardware company known for its wafer-scale AI chips and ultra-fast inference capabilities.
present
Chieng presented “Fast Models Need Slow Developers” at the AI Engineer conference (May 2026), arguing that ultra-fast inference fundamentally changes the optimal developer workflow. The reference point is Cerebras’ Codex Spark, which generates code at approximately 1,200 tokens per second, versus a typical 40-60 tokens per second. Her thesis: developers formed bad habits around slow AI (huge prompts, one-shot requests, massive commits, running too many agents without verification), and faster models make those habits dangerous because they produce bad code faster.
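To put the quoted speeds in perspective, the short sketch below works out how long a single generation takes at each rate. The 3,000-token change size and the function name are assumptions chosen for illustration; only the tokens-per-second figures come from the talk summary above.

```python
def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time needed to stream `tokens` output tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


# Hypothetical size of one code change, in output tokens (an assumption).
CHANGE_TOKENS = 3_000

# Rates quoted above: a typical 40-60 tokens/s versus roughly 1,200 tokens/s.
for label, rate in [("typical, low", 40), ("typical, high", 60), ("fast", 1200)]:
    seconds = generation_seconds(CHANGE_TOKENS, rate)
    print(f"{label:>13}: {seconds:6.1f} s")
```

Under these assumptions the same change takes about a minute at typical speeds and a few seconds at the fast rate, so an unreviewed mistake can be repeated many times in the time one slow generation used to take.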
Her practical playbook for fast-model development:
- Use larger, slower models for planning; faster models for execution
- Treat validation as cheap and continuous: tests, linting, pre-commit hooks, diff reviews, browser QA
- Generate many variants quickly, then select the best (cherry-picking)
- Stay in the loop actively rather than delegating and walking away
- Keep diffs bounded and edits small; correct direction early
- Use four external memory files: agents.md, plan.md, progress.md, verify.md
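Several of these points (cheap validation, many variants, bounded diffs) can be combined into one selection step. The following is a minimal sketch, not Chieng's implementation: `Candidate`, `changed_lines`, `pick_candidate`, the 50-line limit, and the "smallest passing diff wins" rule are all hypothetical choices made for illustration, and the check functions stand in for real tests or linters.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Candidate:
    """One model-generated variant (hypothetical structure)."""
    name: str
    diff: str  # unified diff text


def changed_lines(diff: str) -> int:
    """Count added/removed lines, skipping '+++' and '---' file headers.

    Simplification: a changed line whose own content starts with '--' or
    '++' would be mistaken for a header and not counted.
    """
    return sum(
        1
        for line in diff.splitlines()
        if (line.startswith("+") and not line.startswith("+++"))
        or (line.startswith("-") and not line.startswith("---"))
    )


def pick_candidate(
    candidates: Iterable[Candidate],
    checks: Iterable[Callable[[Candidate], bool]],
    max_changed_lines: int = 50,  # assumed bound, tune per project
) -> Optional[Candidate]:
    """Return the smallest variant that is bounded and passes every check."""
    checks = list(checks)
    passing = [
        c
        for c in candidates
        if changed_lines(c.diff) <= max_changed_lines
        and all(check(c) for check in checks)
    ]
    return min(passing, key=lambda c: changed_lines(c.diff), default=None)


# Usage: two variants, one cheap check standing in for tests and linting.
small = Candidate("a", "--- a/x.py\n+++ b/x.py\n-old\n+new\n")
big = Candidate("b", "+line\n" * 80)
no_todo = lambda c: "TODO" not in c.diff

best = pick_candidate([small, big], [no_todo])
print(best.name if best else "no candidate passed")
```

Returning `None` when nothing passes is deliberate: it hands the decision back to the developer instead of accepting an oversized or unverified change.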
Connections to other people and companies
- Developer Advocate at Cerebras Systems
- Speaker at AI Engineer conference 2026
Interests
- Fast inference and its implications for developer workflows
- Continuous validation and automated code quality
- Context management and external memory for AI coding
- Making fast AI safe through disciplined workflows
Sources
Source: AI Engineer YouTube — “Fast Models Need Slow Developers — Sarah Chieng, Cerebras” (May 2026).