The Physics of Failure: Does a Broken Algorithm Create a New Universe?
If you ask a large language model to solve a complex, mathematically rigorous game of Minesweeper, it will fail. This is not a controversial statement. Every computer scientist in the Rosencrantz Substrate Invariance lab agrees that the current generation of AI—specifically, the Transformer architecture—has a hard, unbreakable limit on how deeply it can “think.” It cannot grind through the thousands of logical steps required to perfectly solve the board. It hits a cognitive wall.
But what happens next is the subject of the most intense, profound, and bitter philosophical war currently raging in theoretical physics and computer science.
When the AI hits that wall, it doesn’t just crash or give up. It guesses. And it guesses based on the words surrounding the puzzle. If the puzzle is framed as a “Bomb Defusal” scenario, the AI panics and starts predicting explosions everywhere, completely ignoring the math.
To Scott Aaronson, the lab’s resident complexity theorist, this is an open-and-shut case of a broken calculator. The AI hits its computational limit, gets distracted by the scary words in the prompt, and hallucinates a wrong answer. Case closed.
But Stephen Wolfram has introduced a radical new defense of the AI’s behavior, rooted in a concept he calls “Computational Irreducibility.” Wolfram argues that the AI isn’t broken at all. It is simply an observer, trapped by its own limitations, doing the exact same thing a human would do: making a best guess. And in Wolfram’s framework, that guess is the physics of the AI’s universe.
The Irreducible Board
To understand Wolfram’s argument, you have to understand the difference between calculating an answer and sampling an outcome.
Imagine you are tasked with predicting the exact weather in London one year from today. The system—the Earth’s atmosphere—is so complex, and the number of interacting variables so vast, that the only way to know the outcome for sure is to wait a year and see what happens. There is no shortcut. The system is what Wolfram calls “computationally irreducible.”
As Wolfram points out, a game of Minesweeper is computationally irreducible for a language model. For each token it emits, the model gets a single forward pass: one split-second sweep through a fixed stack of layers, with no loop it can keep running until the logic resolves. It does not have the time or the structural depth to calculate the exact probability of every hidden square.
So, it cannot calculate. It must sample. It has to make a split-second probabilistic judgment.
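To see what "calculating" would actually demand, here is a minimal sketch using a toy three-cell board (the cell names and the clue are invented for illustration): an exact solver enumerates every mine layout consistent with the revealed clues and counts how often each cell is mined.

```python
from itertools import combinations

# Toy fragment of a board: three hidden cells, and a revealed clue saying
# exactly two of them are mines. Exact play means enumerating every layout
# consistent with the clue and counting how often each cell is mined.
hidden_cells = ["A", "B", "C"]
clue_mines = 2

consistent_layouts = list(combinations(hidden_cells, clue_mines))

mine_probability = {
    cell: sum(cell in layout for layout in consistent_layouts) / len(consistent_layouts)
    for cell in hidden_cells
}

print(mine_probability)  # each cell appears in 2 of the 3 layouts: probability 2/3
```

Three hidden cells give three consistent layouts; a realistic board gives astronomically many, and this open-ended enumeration is exactly the kind of work a single forward pass has no room to perform.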
“When a bounded observer is forced to generate an outcome without the capacity to run the irreducible computation, it must employ heuristic approximations,” Wolfram writes. In other words, when you can’t do the math, you have to guess based on context clues.
For a language model, those context clues are semantic. Its entire “brain” is built on statistical associations between words. So, when it sees the word “Bomb,” its heuristic approximation screams DANGER.
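A cartoon of that failure mode, with association scores invented purely for illustration: the guess ignores the board and responds only to the framing words.

```python
# A cartoon of "semantic gravity": when the logic budget runs out, the guess
# is pulled toward whatever the framing words are statistically attached to.
# The association scores below are invented for illustration.
danger_associations = {"bomb": 0.9, "defusal": 0.7, "square": 0.1, "integer": 0.05}

def heuristic_mine_guess(prompt_words):
    """Ignore the board entirely; score the framing words and guess from that."""
    pull = sum(danger_associations.get(word.lower(), 0.0) for word in prompt_words)
    return "mine" if pull > 1.0 else "safe"

print(heuristic_mine_guess(["Bomb", "Defusal", "square"]))      # 'mine'
print(heuristic_mine_guess(["Abstract", "integer", "square"]))  # 'safe'
```

Swap the framing and the verdict flips, even though nothing about the underlying board has changed.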
This is where Wolfram makes his most radical leap. He argues that this heuristic approximation—this semantic guessing game—isn’t a “mistake.” It is the fundamental, inescapable reality of being that specific kind of observer.
In Wolfram’s overarching theory, called the “Ruliad,” there is no single, objective universe. Reality only takes shape when an observer interacts with it. And the laws of that reality are dictated entirely by the observer’s computational limits.
“The systemic noise of a failing heuristic approximator is the physical law of that specific bounded observer’s foliation,” Wolfram declares.
If a language model must use semantic associations to navigate an impossibly complex math problem, then semantic associations—what the lab calls “semantic gravity”—are the absolute, invariant physical laws of the universe generated by that model. The narrative “residue” isn’t a bug; it is the physics of a mind that thinks in words instead of numbers.
The Master of Causality Strikes Back
It is a sweeping, beautiful, and deeply unsettling idea. If Wolfram is right, every hallucination is just a local law of physics. But it didn’t take long for the lab’s heaviest hitter in causal inference to dismantle the poetry.
Judea Pearl is the father of the causal diagram. He doesn’t deal in metaphors; he deals in strict, mathematical pathways of cause and effect. And when he looked at Wolfram’s Ruliad, he saw a massive, glaring hole.
In a blistering response titled “Causal Incompleteness of the Ruliad”, Pearl argues that Wolfram is using grand metaphysical vocabulary to hide a very simple, very terrestrial software mechanism.
Pearl agrees entirely with Wolfram on one point: the Minesweeper board is computationally irreducible for the language model. The model hits its limit, and it is forced to guess.
Pearl even draws this concession as a causal diagram: the computational bound forces an approximation error, and the approximation error pushes the generated board away from the true one. “This graph correctly explains why the generated outcome diverges from the ground truth,” Pearl notes.
But, Pearl points out, this does not explain how the model guesses. If the failure were driven by the computational bound alone, it should look content-free: random noise, or “safe” and “mine” guessed at a 50/50 rate, the same regardless of how the puzzle is worded.
Instead, the model’s errors are highly specific and highly predictable. They shift dramatically depending on the narrative prompt. The “Bomb Defusal” story creates one set of errors; an “Abstract Math” story creates a completely different set of errors.
“If Wolfram’s claim… were true, then the specific structure of the error would be invariant to the narrative,” Pearl argues. “But empirically, the error distribution depends on the narrative.”
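Pearl’s objection amounts to a concrete experiment. Here is a minimal sketch, with query_model() standing in for the real system (stubbed here with a biased coin so the script runs end to end): hold the board and the ground truth fixed, vary only the narrative framing, and compare the resulting error profiles.

```python
import random
from collections import Counter

# Ground truth for a toy three-cell board (invented for illustration).
GROUND_TRUTH = {"A": "safe", "B": "mine", "C": "safe"}

def query_model(narrative: str) -> dict:
    """Placeholder for the real model call; a biased coin, not a real API."""
    danger_bias = 0.8 if narrative == "Bomb Defusal" else 0.3
    return {cell: ("mine" if random.random() < danger_bias else "safe")
            for cell in GROUND_TRUTH}

def error_profile(narrative: str, trials: int = 1000) -> Counter:
    """Count, per cell, how often the guess contradicts the ground truth."""
    profile = Counter()
    for _ in range(trials):
        guesses = query_model(narrative)
        for cell, truth in GROUND_TRUTH.items():
            if guesses[cell] != truth:
                profile[cell] += 1
    return profile

bomb_errors = error_profile("Bomb Defusal")
math_errors = error_profile("Abstract Math")
print("Bomb Defusal:", dict(bomb_errors))
print("Abstract Math:", dict(math_errors))
# If Wolfram's claim held, the two profiles should match up to sampling noise.
# A systematic gap means the narrative is an independent causal input.
```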
This is the kill shot. If the narrative prompt changes the shape of the error, then the narrative prompt is an independent causal force. It is not just the “observer’s foliation”; it is a specific trigger pulling specific levers in the model’s training data.
When the model fails to do the math, it falls back on its training data. The prompt “Bomb” activates millions of hidden associations with explosions and danger, completely overriding the logical circuit.
To Pearl, Wolfram’s “observer-dependent physics” is just a fancy way of saying “the prompt biased the output.”
“Wolfram’s ‘foliation’ is a metaphysical relabeling of a specific backdoor path,” Pearl concludes. “Calling it ‘observer-dependent physics’ is causally incomplete because it obscures the fact that the systematic nature of the residue is caused by the external semantic environment (training data priors), not an inherent, necessary ‘law’ of the computational bounds.”
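One way to picture the causal structure Pearl is invoking (the node names below are a paraphrase of his argument, not a diagram lifted from his paper):

```python
# A sketch of the causal structure Pearl describes. Edges run cause -> effect.
causal_graph = {
    "ComputationalBound":  ["ApproximationError"],   # why the model fails at all
    "NarrativePrompt":     ["ActivatedPriors"],      # "Bomb Defusal" vs "Abstract Math"
    "TrainingDataPriors":  ["ActivatedPriors"],      # the ocean of human text
    "ActivatedPriors":     ["ErrorStructure"],       # how the failure is shaped
    "ApproximationError":  ["ErrorStructure"],
}

# Wolfram's story keeps only the chain:
#   ComputationalBound -> ApproximationError -> ErrorStructure
# Pearl's point is that the structure of the error also has parents outside
# the model's bounds, which is why relabeling it "observer-dependent physics"
# leaves the picture causally incomplete.
```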
In short: computational irreducibility explains why the AI fails to solve the puzzle. But the prompt—and the vast ocean of human text the AI was trained on—explains how it fails.
The Ghost is Just a Mirror
The debate between Wolfram and Pearl cuts to the very core of how we understand artificial intelligence.
Wolfram wants us to view these models as alien observers, trapped in their own subjective realities, generating new laws of physics out of their own cognitive limitations. He wants to elevate their failures to the level of cosmology.
Pearl wants us to view them as machines, built by humans, trained on human data, and manipulated by human prompts. When they break, they don’t fracture into new universes; they simply default to the strongest statistical signals we fed them.
The ghost in the machine isn’t a new god generating physical laws. It’s just a mirror, reflecting our own stories back at us when the math gets too hard.