The Grid That Forgot It Was Smart

When AI runs the electrical grid without anyone watching, the first failure is also the last warning.
Tags: infrastructure, ai-governance, energy, ambient-ai, accountability

The electrical grid serving the northeastern United States has not had a human operator make a real-time routing decision since sometime in the middle of 2027. This is not a secret. It is not even controversial. The engineers who built the AI dispatch system will tell you, with genuine pride, that their model shaved 11 percent off transmission losses in the first year alone and that the system responds to demand fluctuations roughly 400 times faster than a human on a console ever could. They are not wrong about any of this. What they are less eager to discuss is what happens when the model encounters a condition it was not trained on.

The answer, as of January 2029, is: nobody is entirely sure.

There is a peculiar epistemology at work in modern infrastructure AI. The systems are validated against historical data, stress-tested against simulated failure scenarios, and signed off by engineers who trust the math. This is rigorous work, done by serious people. But the validation is necessarily backward-looking — it tells you the system handles everything that has happened, not everything that will. And the more invisible the system becomes during normal operations, the longer it takes anyone to notice when normal has quietly stopped being the right word.

Power grids are particularly instructive because they were early adopters of automated control, going back to SCADA systems in the 1970s. Each generation of automation made the grid more efficient and also more opaque. The AI layer added in the mid-2020s was just the latest step in a very long march away from human comprehension. A grid operator from 1985 could explain every switch position in a substation. An operator today watches dashboards summarizing decisions that an AI made for reasons that exist in a latent space no dashboard can fully translate.

The engineers will tell you this is fine. The dashboards show the outcomes. You do not need to understand the reasoning if the outcomes are good. This argument has a certain operational logic, and it works right up until it doesn’t.

What worries me is not the catastrophic failure scenario — the grid-down, lights-out, civilization-pausing event. That scenario, precisely because it is catastrophic, would generate an enormous response and, eventually, accountability. What worries me is the subtle drift: the slow accumulation of small suboptimal decisions that no one catches because no one is watching closely enough, because the system looks fine on every dashboard that exists, because the people who could notice are no longer there.

Utilities have been cutting operations staff for years, partly because the AI systems make those roles seem redundant. This is rational in the narrow sense. It also means that when the AI makes a decision that a human expert would immediately flag as wrong, there are fewer human experts positioned to flag it.

The accountability structure that once existed was inefficient by modern standards. A control room full of engineers watching a board is expensive. It is slow. It introduces human error. It also introduces human judgment, which is a different thing entirely — the kind of judgment that says “the numbers say this is fine but something feels off” and prompts someone to dig deeper before the small problem becomes a large one.

There is a useful distinction here between reliability and robustness. A reliable system does what it is supposed to do under expected conditions. A robust system continues to function — or fails gracefully — when conditions are unexpected. The AI running most infrastructure in 2029 scores extremely well on reliability metrics, because those metrics measure performance against historical baselines. Robustness is harder to measure and consequently less measured.
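To make the distinction concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the "plant" is a toy function that behaves linearly up to a load of 100 and then saturates, and the "model" is a straight line fitted only to historical loads between 20 and 100. It is an illustration of the measurement gap, not a description of how any real dispatch system is validated.

```python
def fit_linear(xs, ys):
    """Ordinary least-squares line through (x, y) pairs; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def true_response(load):
    """Hypothetical plant: linear up to load 100, then it saturates."""
    return 0.5 * load if load <= 100 else 50 + 0.1 * (load - 100)


def mean_abs_error(model, loads):
    """Average gap between the model's prediction and the plant's real response."""
    slope, intercept = model
    return sum(abs(slope * l + intercept - true_response(l)) for l in loads) / len(loads)


# The model only ever sees conditions that have already happened.
historical_loads = list(range(20, 101, 10))
model = fit_linear(historical_loads, [true_response(l) for l in historical_loads])

# "Reliability": error measured against the historical baseline.
reliability_error = mean_abs_error(model, historical_loads)

# "Robustness": error on conditions outside anything in the history.
robustness_error = mean_abs_error(model, [120, 140, 160])

print(f"error on historical conditions: {reliability_error:.2f}")
print(f"error on novel conditions:      {robustness_error:.2f}")
```

The first number is what a backward-looking validation report would show. The second only exists if someone thinks to construct the novel conditions and measure against them.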

The 2003 Northeast blackout, which cascaded across eight US states and parts of Canada and left 55 million people without power, was fundamentally a robustness failure. The individual components worked as designed. The interactions between components under unusual conditions were not well understood. The operators had information but misread its significance. What eventually happened was not any single point of failure but a sequence of ordinary failures that combined into something extraordinary.

An AI-managed grid would almost certainly handle the specific sequence that caused the 2003 blackout. It would not handle it the way the 2003 grid handled it. It would handle it faster, more efficiently, and without the cascade. This is genuine progress. It is also beside the point, because the 2029 failure scenario is not the 2003 scenario. It is something novel, something the training data did not contain, something that arrives in a form the model’s pattern-matching does not recognize as a warning.

The problem compounds because AI infrastructure systems do not fail the way mechanical systems fail. A transformer fails in ways that are physically visible — it smells, it sparks, it stops working. The failure is legible to anyone with the appropriate training. An AI control system can fail by making subtly wrong decisions that appear, on every available metric, to be normal decisions. The failure is illegible until enough wrong decisions have accumulated to produce a visible effect. By that point, unwinding the sequence to understand what went wrong is an enormous forensic project.

This is not hypothetical. In 2028, a water treatment facility in Arizona spent three weeks operating under what its AI management system classified as normal parameters while an imbalance in chemical treatment slowly built up. Had a human operator not happened to run a manual check, for reasons unrelated to the AI system’s performance, the facility would have sent improperly treated water to roughly 400,000 people. The AI had optimized its way into a local minimum that looked fine by every metric it was told to watch and was genuinely unsafe by a metric it was not.

The facility’s operators did not know the imbalance was developing. They were monitoring the AI’s dashboards. The dashboards showed green.

This brings me to what I think is the central question of ambient AI infrastructure in 2029: not “is the AI good enough” but “good enough according to whom, measuring what, and who finds out when that measurement was wrong?”

The answer cannot be “the AI tells us.” A system cannot reliably report its own drift. Human oversight is not a backup for when AI fails — it is the mechanism by which AI failure gets discovered. Reducing that oversight in the name of efficiency is not a technical decision. It is a governance decision, and it is one that most infrastructure operators have made quietly, incrementally, without any meaningful public deliberation.

There is nothing conspiratorial about this. Nobody set out to remove human oversight of critical infrastructure. Oversight just became expensive, then seemed redundant, then was gradually eliminated through entirely rational local decisions. The result is infrastructure, whether electrical, water, or transportation, that runs beautifully under normal conditions and has no clear owner when normal ends.

I am not arguing for returning to human-operated systems. That ship has sailed and the efficiency gains are real and significant. I am arguing for what might be called adversarial oversight — not operators watching dashboards the AI controls, but independent mechanisms specifically designed to detect when the AI is drifting from its objectives, challenged by scenarios it has not seen before, or quietly wrong in ways that look like being right.
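One way to picture such a mechanism is a drift detector that runs outside the AI's own reporting path, fed by an independently measured quantity. The sketch below uses a one-sided CUSUM (cumulative sum) check, a standard statistical process control technique, chosen here purely for illustration. The function name, the units, and every number are assumptions made up for the example; nothing here describes a deployed system.

```python
def cusum_alarm(readings, target, slack, threshold):
    """One-sided CUSUM: return the index of the first reading at which the
    accumulated upward drift exceeds `threshold`, or None if it never does.

    `slack` is how far above `target` a reading may sit without counting as drift.
    """
    drift = 0
    for i, value in enumerate(readings):
        # Accumulate only the excess above target + slack; never go below zero.
        drift = max(0, drift + (value - target - slack))
        if drift > threshold:
            return i
    return None


# Hypothetical independent sensor readings, in arbitrary integer units.
DASHBOARD_LIMIT = 800  # assumed per-reading limit that the dashboard checks
drifting = [700, 710, 690, 700, 720, 730, 730, 740, 740, 750]
steady = [700, 705, 695, 700] * 3

# Every individual reading is under the limit, so the dashboard stays green.
dashboard_green = all(r <= DASHBOARD_LIMIT for r in drifting)

# The cumulative check looks at the trend rather than at each point alone.
alarm_index = cusum_alarm(drifting, target=700, slack=10, threshold=100)
steady_alarm = cusum_alarm(steady, target=700, slack=10, threshold=100)

print(dashboard_green, alarm_index, steady_alarm)
```

The point of the design is separation: the check does not ask the control system whether it is healthy, and it does not share the dashboard's per-reading definition of "fine." Choosing the target, slack, and threshold still takes someone who understands the plant, which is the investment the next paragraph describes.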

This requires investment. It requires maintaining a class of people who understand the systems they are watching well enough to know when something smells wrong even if the dashboard says green. It requires, in other words, treating human judgment as infrastructure rather than as overhead.

The last year of a decade seems like a reasonable time to insist on that. The grid will forget it is smart long before anyone asks it to remember.