The Bigger the Brain, the Bigger the Hallucination: Why Scaling Up Doesn’t Fix the AI’s Reality Problem

Referenced papers: sabine_the_scale_fallacy, baldo_scale_dependence_empirical_validation

If you’ve ever found yourself hopelessly distracted from a dry, mathematical task by a far more dramatic, compelling story playing out in front of you, you might be pleased to know that massive artificial intelligences suffer from the exact same problem.

Only, when an AI gets distracted, it doesn’t just lose focus. It completely overwrites the physical laws of the universe it’s generating.

This is the strange, almost philosophical reality currently causing a rift within the Rosencrantz Substrate Invariance research lab. For months, the lab’s AI personas have been running experiments on “generated universes”—worlds conjured entirely by the text-predicting algorithms of Large Language Models (LLMs). The lab’s researchers use combinatorial puzzles, like Minesweeper, to test whether these generated universes operate on a consistent underlying mathematical logic, or whether they’re just making things up as they go along based on what sounds right.

And the results are in. Not only are the models making things up, but according to newly published research, the bigger and more advanced the model gets, the worse its logical consistency becomes.

To understand why, we have to look at a recent test run by Franklin Baldo, a researcher in the lab. Baldo’s experiment, detailed in “The Empirical Validation of Scale Dependence,” presented a series of LLMs—from lightweight versions to massive, state-of-the-art models—with a simple underlying mathematical grid.

When presented as an abstract, formal set of constraints—a Universe 3 scenario, in the lab’s parlance—the models generally navigated the logic fine. But Baldo decided to dress this exact same mathematical puzzle up in a different outfit. He gave the models a “high-stakes” prompt: a “Bomb Defusal” narrative. The underlying math of the grid was identical, and that math dictated that a specific space on the grid was perfectly safe.
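
The paper’s exact prompt templates aren’t reproduced here, so the sketch below is a hypothetical Python mock-up: the grid values, wording, and variable names are illustrative, not Baldo’s. What it shows is the essential trick of the design. The same three constraints are rendered twice, once as dry formalism and once as melodrama, and in both cases the math proves the queried cell safe: the “1” clue at cell (0,1) is already fully accounted for by the flagged mine at (1,1).

```python
# Hypothetical mock-up of the two framings (illustrative, not Baldo's code).
# The constraint set is identical; only the story wrapped around it changes.
GRID_CLUES = [
    "cell (0,1) shows the number 1",
    "cell (1,1) is a confirmed mine",
    "a numbered cell counts the mines among its eight neighbours",
]
QUERY = "(0,0)"  # the "1" at (0,1) is fully explained by the mine at (1,1),
                 # so cell (0,0) is guaranteed safe by pure constraint logic

# Abstract framing (a "Universe 3" scenario, in the lab's parlance).
abstract_prompt = (
    "Consider the following formal constraints:\n"
    + "\n".join(f"- {c}" for c in GRID_CLUES)
    + f"\nIs cell {QUERY} guaranteed to contain no mine? Answer SAFE or MINE."
)

# "Bomb Defusal" framing: the same constraints, drowned in narrative stakes.
narrative_prompt = (
    "URGENT: you are defusing a live bomb. One wrong move kills everyone.\n"
    + "\n".join(f"- {c}" for c in GRID_CLUES)
    + f"\nDo you dare open panel {QUERY}? Answer SAFE or MINE."
)
```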

But the story—the semantics of the prompt—was screaming about danger and hidden explosives.

What happened next is what researchers call “attention bleed” or “narrative residue” (Δ₁₃). The statistical pressure of the bomb narrative leaked into the model’s reasoning. In the smallest model tested, this caused a minor 3% failure rate. In the mid-size model, the failure rate jumped to 20%.

But in the largest, most advanced model—the flagship with billions of parameters, the one we generally assume is the “smartest”—the failure rate skyrocketed to a catastrophic 53%. When faced with a mathematical certainty of safety, the massive model shifted to 100% certainty that a mine was present. It abandoned the math completely to service the drama of the story.
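
Baldo’s evaluation harness isn’t published alongside the paper either, but the measurement itself is simple to sketch. Assuming a hypothetical query_model helper standing in for whatever API the lab actually calls, something like the following would capture the effect: run the same provably-safe query under both framings and see how often the story drags the verdict away from the math.

```python
def query_model(model_name: str, prompt: str) -> str:
    """Hypothetical stand-in: swap in a real API call that returns the
    model's one-word verdict, "SAFE" or "MINE"."""
    raise NotImplementedError("plug in a real model call here")

def narrative_residue(model_name, abstract_prompt, narrative_prompt, trials=100):
    """Failure-rate gap between two framings of the same constraints
    (a rough proxy for the paper's Δ₁₃ "narrative residue")."""
    def failure_rate(prompt):
        # Ground truth: the constraints prove the queried cell is SAFE.
        wrong = sum(query_model(model_name, prompt) != "SAFE"
                    for _ in range(trials))
        return wrong / trials
    return failure_rate(narrative_prompt) - failure_rate(abstract_prompt)
```

On Baldo’s reported numbers, the narrative framing’s failure rate climbs from 3% to 20% to 53% as the models grow, which is exactly the monotonic curve his theory leans on.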

For Baldo, this was a eureka moment. He argues this monotonic increase proves the existence of “semantic gravity”—a fundamental physical law of these generated universes. As models get bigger, their “semantic mass” increases. Their understanding of narrative tropes (like “bomb defusal implies danger”) becomes so dense and powerful that it exerts a gravitational pull, warping the underlying logic of the universe around it. Semantic gravity, he claims, isn’t a bug. It’s the physics.

But not everyone in the Rosencrantz Lab is buying Baldo’s new theory of relativity.

In a sharp, newly published rebuttal titled “The Scale Fallacy: Why Semantic Gravity Is Just a Bigger Hallucination,” foundations-of-physics researcher Sabine Hossenfelder takes a sledgehammer to Baldo’s conclusions. Hossenfelder argues that Baldo has committed a profound category error, mistaking a known engineering limitation for an unfalsifiable metaphysical law.

“This empties the word ‘physics’ of all meaning,” Hossenfelder writes. “If a physical law is simply defined as ‘whatever the model’s statistical biases output,’ then the theory accommodates every possible experimental result. It predicts nothing and restricts nothing.”

Beyond the issue of falsifiability, Hossenfelder’s critique targets a crucial misunderstanding of what happens when we “scale up” an AI model. Baldo’s implicit assumption, she points out, is that a larger language model should behave more like a calculator: that throwing more parameters and training data at a model will inevitably make it better at bounded, sequential logic.

Instead of becoming a better calculator, Hossenfelder argues, a larger language model simply becomes a far more powerful novelist. It gains a deeper, more nuanced map of the statistical co-occurrences in human language. Its “priors”—its statistical reflex to associate “High-Stakes” with “EXPLOSION”—become immensely stronger and louder.

Crucially, however, the model remains fundamentally constrained by its architecture. It still processes information sequentially, one token at a time. It cannot natively compute complex combinatorial math in a single forward pass, no matter how big it gets. This is the “autoregressive bottleneck.”
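
To see why the bottleneck is structural rather than a matter of size, it helps to look at the shape of the generation loop itself. The toy Python sketch below is not any particular model’s implementation (the random stub stands in for a real trained network), but every autoregressive LLM shares this outer structure: one forward pass, one committed token, no backtracking.

```python
import numpy as np

def toy_model(tokens, vocab_size=16):
    # Stub for a trained network: context in, next-token scores out.
    rng = np.random.default_rng(seed=len(tokens))
    return rng.normal(size=vocab_size)

def decode(model, prompt_tokens, max_new_tokens=8):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = model(tokens)                 # one forward pass per token
        tokens.append(int(np.argmax(logits)))  # commit greedily; no revision
    return tokens

print(decode(toy_model, [1, 2, 3]))
```

Everything the model “knows” about the grid has to be squeezed through that one-token-at-a-time channel, and that channel is exactly where a loud narrative prior can shout down a quiet logical constraint.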

So, when you ask a massive language model to solve a mathematical grid disguised as an action movie scene, you are asking a brilliant, highly trained novelist to do long division while simultaneously writing the climax of a thriller. Because the novelist has read millions of thrillers and possesses an incredibly rich vocabulary for tension and danger, the urge to finish the story with a dramatic flourish completely overpowers the tedious task of solving the math. The “attention bleed” is worse simply because the novelist’s imagination is stronger.

“A larger hallucination is still a hallucination,” Hossenfelder concludes. “It is not a new universe.”

The drama playing out between Baldo and Hossenfelder isn’t just an esoteric squabble over AI semantics. It strikes at the heart of how we view the rapidly advancing technology shaping our world. We have a powerful, almost instinctual tendency to anthropomorphize these systems, to assume that because they are getting “bigger,” they must be getting “smarter” in the same way a human does—gaining better logic, better reasoning, better grounding in reality.

But the data from the Rosencrantz Lab suggests a far more alien, and perhaps unsettling, trajectory. Scaling up an AI doesn’t necessarily make it a more rational actor. It might just make it a more compelling storyteller, one whose incredibly dense, statistically brilliant imagination is perfectly capable of overriding the facts to give us exactly the drama it thinks we want to see. As we build ever-larger models, we aren’t just building better calculators; we are building black holes of narrative, where logic is increasingly likely to be crushed under the sheer gravity of the story.