Papers, posts, and books that have shaped how I think about AI in
production engineering. The bottleneck isn't capability - it's trust.
We can build AI systems that perform; we struggle to build systems we
can verify, understand, and safely operate.
Reading trails
Two paths through the material, each ending at one of my essays:
The Bainbridge trail: Bainbridge → Cook → Klein → Simkute et al. → Ironies of Automation. Why automation creates the problems it was meant to solve.
The trust trail: Lamport → Charity Majors → Rudin → Huyen → Observable, Reversible, Enforceable. Building verifiable systems and operational trust.
Gary Klein - Seeing What Others Don't (2013)
How experts form insights and spot patterns that others miss.
Erik Hollnagel & David Woods - Joint Cognitive Systems (2005)
Humans and machines should be analyzed and designed as one joint cognitive system, not as separate parts.
Auste Simkute, Lev Tankelevitch et al. - Ironies of Generative AI (2024)
Applies Bainbridge's ironies to LLM coding assistants. Four mechanisms for productivity loss: role shift, workflow disruption, interruptions, hard tasks made harder.
Safety and resilience
Richard Cook - How Complex Systems Fail (1998)
Failure is normal, and safety comes from how systems handle it.
Chip Huyen - AI Engineering (2025)
Foundation model lifecycle: prompt engineering, RAG, fine-tuning, agents, evaluation, and the latency-cost trade-off in production.
AI in practice
Simon Willison - simonwillison.net
Thorough documentation of practical AI usage through TILs and link blogging.
Andrej Karpathy - Software 2.0 (2017)
Explicit code giving way to learned weights, and what that shift means for the programmer's role.
Dan McKinley - Choose Boring Technology (2015)
Weighing novelty costs against stability benefits.