Bainbridge's Ironies of Automation, forty years on
August 2025
A Hacker News thread sent me to Lisanne Bainbridge's Ironies of Automation, written in 1983. I was expecting historical interest - she was writing about chemical plants and power stations. What I got was an uncomfortably precise description of the AI integration problems I'd been thinking about that same week.
The paper was published in Automatica, an engineering journal. Six pages. Over 2,000 citations. If you work in any field where humans interact with automated systems, read it.
The core insight
Her insight wasn't "humans make mistakes." It was that automation tends to remove the very work that keeps humans competent, then hands responsibility back to them at the worst possible moment.
She identified several ironies. Designers automate tasks because they consider humans unreliable, but the designers are themselves human, and their errors get baked into the system as latent faults that may lie dormant for years. Operators who no longer perform a task manually lose the skill to perform it, so when the automation fails and they're asked to take over, they're worse at it than they were before the automation existed. And the most automated systems, covering the most edge cases, demand the most operator training. Not less.
Read that again if you work with AI coding tools. It's 1983 and she's describing 2025.
The human-in-the-loop problem
Bainbridge's ironies are starting to show up clearly with AI, and nowhere more than in how we design oversight.
Most "human-in-the-loop" designs I see today are structurally flawed. We let models do the reasoning, the exploration, the pattern-finding, and then ask a human to approve or override the output without having participated in the process that produced it. At that point, the human isn't providing judgment; they're absorbing liability.
This is Bainbridge's monitoring paradox, restated for software engineering. The operator is asked to supervise a process they didn't perform, catch errors in reasoning they didn't follow, and intervene at precisely the moment when the system has already exhausted the easy cases and is failing on the hard ones.
Think about code review for AI-generated PRs. A developer asks Claude to refactor a module. The model produces 200 lines of clean, well-structured code. A reviewer looks at the diff. They didn't write it. They didn't follow the chain of reasoning. They can see what the code does, but they can't easily see what the code should have done and didn't. They approve it because it looks right. That's not review. That's rubber-stamping with extra steps.
The deskilling problem
Bainbridge's skill-degradation argument is the one that should worry engineering leaders most. Physical skills deteriorate when they're not used. Cognitive skills do too.
If a junior engineer spends their first two years generating code with AI and having it reviewed by seniors, what do they actually learn? They learn to prompt well. They learn to read diffs. They may learn to write good tests. But do they build the deep mental model of a system that comes from writing it yourself, debugging it alone when nobody was around to ask, and gradually understanding why the abstractions are shaped the way they are?
I'm not sure. It's too early to know. But the question matters, because those deep mental models are exactly what we rely on when things go wrong - when the system is in a state nobody anticipated and someone needs to reason from first principles about what's happening. If we've automated away the work that builds those models, we've created Bainbridge's irony: we need the skill most precisely when we've done the least to maintain it.
When the code handles money
In financial services, deskilling isn't an abstract concern about engineering craft. It's operational risk with a specific cost.
I run technology for bespoke investment mandates - the kind where each client portfolio has its own constraints, its own reporting requirements, its own regulatory obligations. The systems that manage this are intricate and layered, full of business logic that was never documented well enough because the people who built it understood it and the people who came after learned it by maintaining it. That chain of transmitted understanding is what Bainbridge would call operator competence. It's also the first thing at risk when AI takes over the routine work.
When something breaks during overnight batch processing - a NAV calculation that doesn't reconcile, a regulatory feed that throws an exception nobody's seen before - someone needs to diagnose it quickly. Not just read the stack trace. Understand the data flow, the business rules, the reason this particular client's portfolio is handled differently from every other. If the person on call hasn't built that understanding through years of working inside the system, they're reading logs and guessing.
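A reconciliation break of that kind often starts as a single failing tolerance check. Here's a minimal sketch of one - the function name, figures, and one-basis-point threshold are all invented for illustration, not taken from any real system. The point is what the check can't tell you: it flags *that* the NAV disagrees with the custodian's figure, but diagnosing *why* still needs the human understanding described above.

```python
from decimal import Decimal

# Hypothetical tolerance: one basis point of the custodian's NAV.
TOLERANCE = Decimal("0.0001")

def nav_reconciles(computed_nav: Decimal, custodian_nav: Decimal) -> bool:
    """Return True if our computed NAV agrees with the custodian's
    figure within tolerance. A False here is a 'break' - the check
    says nothing about whether the cause is a stale price, a missed
    corporate action, or a client-specific fee rule."""
    return abs(computed_nav - custodian_nav) <= TOLERANCE * custodian_nav

# Agrees within a basis point: no break.
print(nav_reconciles(Decimal("102.3456"), Decimal("102.3455")))  # True
# Off by several basis points: someone gets paged.
print(nav_reconciles(Decimal("102.3456"), Decimal("102.4100")))  # False
```

The check is trivial; the diagnosis isn't. That asymmetry is exactly where transmitted understanding earns its keep.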
The SM&CR regime in the UK assigns personal accountability to senior managers for failures in their areas. "The AI generated it" isn't an acceptable answer when a compliance officer asks why a calculation produced the wrong result. You can't delegate understanding to a model.
This is where Bainbridge's ironies bite hardest. The routine work - the bug fixes, the small enhancements, the migration of a calculation from one framework to another - is exactly what builds the mental models people need when things go wrong. Automate that work away and you remove the training ground. The system keeps running. The understanding doesn't compound.
Two directions for the same tool
Tools like Claude Code can go either way.
Used naively, they accelerate deskilling: fewer reps, less context, more opaque decisions. The engineer becomes a prompt-writer who approves outputs. The gap between what the system does and what the human understands grows wider. When something breaks, nobody has the mental model to diagnose it.
Used deliberately, with reusable skills, explicit structure, tests, constraints, and review loops, they can do the opposite: preserve the human mental model while stripping away boilerplate. The engineer still makes the architectural decisions. The AI handles the repetitive implementation. The tests verify the intent. The human stays engaged with the hard parts - the parts that build real understanding.
The leadership challenge isn't how much to automate. It's knowing which cognitive work to protect.
If you can't follow it now, you won't debug it later
Bainbridge wrote that if a monitoring task is too complex for the operator to follow during normal operation, it's unrealistic to expect them to understand it during failure. This translates directly.
If an AI system generates code that the team can't follow during normal development, understanding won't suddenly appear during a production incident. The system won't explain itself. The logs won't map to anyone's mental model. The person on call will be staring at code they've never engaged with, trying to reason about behaviour they've never thought through.
The fix isn't to avoid AI tools. It's to design workflows where humans remain engaged with the decisions that matter. Smaller AI-generated changes that a human can actually review. Tests that encode intent, so when something breaks you know what was supposed to happen. Keeping humans in the reasoning loop, not just the approval loop.
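One concrete form of "tests that encode intent": tests whose names and assertions state the business rule rather than the implementation. The withholding-tax rule, rates, and function here are invented for illustration - the pattern, not the domain logic, is the point. Whoever (or whatever) wrote the implementation, the test says what was supposed to happen.

```python
import unittest

def withholding_tax(gross_dividend: float, treaty_rate: float) -> float:
    """Hypothetical implementation - perhaps AI-generated."""
    return round(gross_dividend * treaty_rate, 2)

class TestWithholdingIntent(unittest.TestCase):
    """Each test name and comment records the rule, so a failure
    during an incident tells you what the code *should* have done."""

    def test_treaty_rate_applies_to_gross_not_net(self):
        # Intent: tax is charged on the gross dividend at the treaty rate.
        self.assertEqual(withholding_tax(1000.00, 0.15), 150.00)

    def test_zero_rate_treaty_means_no_withholding(self):
        # Intent: a 0% treaty leaves the dividend untouched.
        self.assertEqual(withholding_tax(1000.00, 0.0), 0.0)

if __name__ == "__main__":
    unittest.main()
```

A reviewer who didn't write the implementation can still check these assertions against the rule they know - which puts them back in the reasoning loop, not just the approval loop.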
Bainbridge saw this coming forty years ago. We should probably listen.