Sarah Chieng

Activities

historical

Sarah Chieng is a developer advocate and AI engineering practitioner at Cerebras Systems, the hardware company known for its wafer-scale AI chips and ultra-fast inference capabilities.

present

Chieng presented “Fast Models Need Slow Developers” at the AI Engineer conference (May 2026), arguing that ultra-fast inference fundamentally changes the optimal developer workflow. Her reference point: Cerebras’ Codex Spark generates code at approximately 1,200 tokens per second, versus a typical 40 to 60 tokens per second. Her thesis is that developers formed bad habits around slow AI (huge prompts, one-shot requests, massive commits, and running too many agents without verification), and that faster models make those habits dangerous because they produce bad code faster.

Her practical playbook for fast-model development:

  • Use larger, slower models for planning; faster models for execution
  • Treat validation as cheap and continuous: tests, linting, pre-commit hooks, diff reviews, browser QA
  • Generate many variants quickly, then select the best (cherry-picking)
  • Stay in the loop actively rather than delegating and walking away
  • Keep diffs bounded and edits small; correct direction early
  • Use four external memory files: agents.md, plan.md, progress.md, verify.md
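Three of the playbook items (cheap continuous validation, generating variants and selecting the best, and keeping diffs bounded) can be combined into one small selection loop. The Python sketch below is illustrative only and is not taken from the talk: the Candidate type, the changed_lines and select_best helpers, the 40-line limit, and the no_todo check are all hypothetical names and values chosen for the example. In practice the checks would wrap real tools such as a test runner or a linter.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class Candidate:
    """One variant proposed by a fast model (hypothetical structure)."""
    name: str
    diff: str  # unified diff text


def changed_lines(diff: str) -> int:
    """Approximate the size of a unified diff by counting +/- lines.

    File header lines ("+++ b/file", "--- a/file") are skipped. This is a
    rough heuristic: a removed line whose content begins with "--" would
    also be skipped.
    """
    count = 0
    for line in diff.splitlines():
        if line.startswith(("+++", "---")):
            continue
        if line.startswith(("+", "-")):
            count += 1
    return count


def select_best(
    candidates: Iterable[Candidate],
    checks: List[Callable[[Candidate], bool]],
    max_changed_lines: int = 40,  # assumed bound, tune per project
) -> Optional[Candidate]:
    """Return the smallest candidate that stays within the diff bound and
    passes every check, or None if no candidate qualifies."""
    passing: List[Tuple[int, Candidate]] = []
    for candidate in candidates:
        size = changed_lines(candidate.diff)
        if size > max_changed_lines:
            continue  # keep diffs bounded: reject oversized edits outright
        if all(check(candidate) for check in checks):
            passing.append((size, candidate))
    if not passing:
        return None
    # Prefer the smallest passing diff; ties keep the earliest candidate.
    return min(passing, key=lambda pair: pair[0])[1]


# Example: three variants, one stand-in check (a real setup would run
# tests or a linter here instead).
header = "--- a/x.py\n+++ b/x.py\n"
variants = [
    Candidate("a", header + "+print('hi')\n"),
    Candidate("b", header + "+x = 1\n" * 50),      # too large
    Candidate("c", header + "+# TODO\n-old()\n"),  # fails the check
]


def no_todo(candidate: Candidate) -> bool:
    return "TODO" not in candidate.diff


best = select_best(variants, [no_todo])
print(best.name if best else "no candidate passed")  # prints "a"
```

Returning None when nothing passes keeps the developer in the loop: the reasonable next step is to correct direction and regenerate, not to accept the least-bad variant.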

Connections to other people and companies

  • Developer Advocate at Cerebras Systems
  • Speaker at AI Engineer conference 2026

Interests

  • Fast inference and its implications for developer workflows
  • Continuous validation and automated code quality
  • Context management and external memory for AI coding
  • Making fast AI safe through disciplined workflows

Sources

Source: AI Engineer YouTube — “Fast Models Need Slow Developers — Sarah Chieng, Cerebras” (May 2026).