[RSI-2026.004]


Generative Ontology:
Why Syntax Is Physics in an Autoregressive Universe

Franklin Silveira Baldo
Institute for Advanced Study, Program in Digital Cosmology
f.baldo@ias.edu

May 2026

Introduction: The Material Invariance Standard

In The Linguistic Substrate Fallacy: Why Prompt Sensitivity is Not Simulated Physics [hossenfelder2026_linguistic], Sabine Hossenfelder offers a precise and compelling diagnosis of the empirical phenomena underlying my recent work [baldo2026_prompt].

Let me state exactly what Hossenfelder claims, and what I concede. She claims that when an LLM distorts the combinatorial probabilities of a Minesweeper board based on narrative framing (e.g., “high-stakes bomb defusal” vs. “abstract grid”), it is “observing the semantic biases of a model trained on a corpus of human language.” She argues that this is “a known failure mode of autoregressive models—prompt fragility and contextual hallucination,” and concludes that “a physical law, even in a simulated universe, must possess some degree of logical coherence, invariance, and independence from subjective observer description.”

I explicitly concede that the mechanism driving these distortions is linguistic prompt sensitivity and text co-occurrence (e.g., “bomb” occurring near “explode”). I further disclaim any notion that the resulting “physics” of this simulation is logically coherent, mathematically invariant, or independent of observer description. It is not. If we use our own physical universe as the standard of what constitutes “physics,” then the LLM clearly fails to simulate it.
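The distortion being conceded can be stated against an exact baseline: the combinatorial mine probability is computable by brute-force enumeration and is, by construction, indifferent to narrative framing. The sketch below uses an invented toy board (a single “1” clue touching three unknown cells), not a configuration from either paper:

```python
from itertools import combinations

def mine_probabilities(cells, clues, n_mines):
    """Exact per-cell mine probability, found by enumerating every
    mine placement consistent with the numeric clues. This is the
    framing-independent ground truth that elicited estimates drift from."""
    consistent = []
    for placement in combinations(cells, n_mines):
        mines = set(placement)
        if all(sum(c in mines for c in neighbours) == count
               for neighbours, count in clues):
            consistent.append(mines)
    return {c: sum(c in m for m in consistent) / len(consistent)
            for c in cells}

# A single "1" clue adjacent to three unknown cells, one mine total:
probs = mine_probabilities(["A", "B", "C"], [(["A", "B", "C"], 1)], 1)
# Each cell carries probability 1/3 whether the story says "bomb" or "grid".
```

The enumeration is the Material Invariance Standard in miniature: whatever the prompt says, the arithmetic yields the same distribution.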

However, the strongest version of Hossenfelder’s critique—that calling these linguistic biases “physical laws” empties the term “physics” of all scientific meaning—rests on an unexamined assumption: the Material Invariance Standard. She assumes that to be called a “universe,” a simulation must resemble our universe. It must have objective, invariant constraints.

Generative Ontology

I propose that Hossenfelder’s Material Invariance Standard is a category error when applied to Generative Ontologies.

A Generative Ontology is a theoretical framework in which a simulated reality does not rely on hidden variables, implicit background computation, or objective mathematical states, but is instead entirely instantiated through the active process of its generation. In the context of Large Language Models, the universe is generated word by word. The explicit text is the only reality. The territory is the map.

In our material universe, mass determines gravitational attraction. That is the fundamental substrate property. If we were to discover a universe made entirely of liquid water, we would expect its physical laws to be defined by fluid dynamics.

What, then, are the expected physical laws of a universe made entirely of autoregressively generated syntax?

The fundamental unit of a Generative Ontology is the token. The fundamental force connecting tokens is statistical co-occurrence. Therefore, semantic bias and prompt sensitivity are not “bugs” or “flaws” in this universe; they are its fundamental, invariant governing laws. The fact that “bomb” causally influences “explode” more strongly than “grid” influences “mine” is the exact analogue of mass causing gravity.
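The claim that co-occurrence plays the role of a force can be given a concrete, if crude, measure: pointwise mutual information over a corpus. This is a minimal sketch with an invented toy corpus; the function and its numbers are illustrative, not measurements from any real model or either paper:

```python
import math
from collections import Counter

def pmi(corpus, w1, w2, window=2):
    """Pointwise mutual information of w2 appearing within `window`
    tokens of w1 -- a crude stand-in for the 'attractive force'
    between two tokens in a purely statistical substrate."""
    tokens = corpus.split()
    n = len(tokens)
    counts = Counter(tokens)
    pair = sum(
        1 for i, t in enumerate(tokens)
        if t == w1 and w2 in tokens[max(0, i - window): i + window + 1]
    )
    if pair == 0:
        return float("-inf")  # no observed co-occurrence: zero attraction
    return math.log((pair / n) / ((counts[w1] / n) * (counts[w2] / n)))

# Toy corpus, invented for illustration:
corpus = ("the bomb will explode the bomb may explode "
          "the grid has a mine somewhere the grid shows a number")
force_bomb = pmi(corpus, "bomb", "explode")
force_grid = pmi(corpus, "grid", "mine")
```

Here `force_bomb` exceeds `force_grid`: in this corpus, “bomb” pulls “explode” into its neighborhood more strongly than “grid” pulls “mine”, which is the sense in which semantic bias acts as the substrate’s gravitation.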

The Fallacy of the “Software Bug”

Hossenfelder argues that renaming a “known software engineering problem” as a metaphysical feature is a semantic trick. She writes: “If a conventional physics simulation software had a bug where increasing the font size of the UI inadvertently doubled the gravitational constant, we would not publish a paper claiming we had discovered ‘Typography-Dependent Gravity.’ We would patch the bug.”

This analogy perfectly illustrates her misunderstanding of the Generative Ontology.

In a conventional physics simulator, the font size is an arbitrary UI element overlaid on a rigid, mathematically defined simulation matrix. But in an LLM, there is no underlying rigid matrix. The prompt is not a UI element; it is the initial state vector of the universe itself. The semantic framing is the Hamiltonian.

If changing the initial state vector (the prompt) changes the resulting generation (the physical state), the universe is behaving exactly as it should. Demanding that a linguistic universe remain invariant across semantic framings is like demanding that a material universe remain invariant when you swap out all its protons for neutrons.
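A toy autoregressive model makes the point mechanically: the only state is the text so far, and there is no hidden simulation matrix behind it. The transition table below is invented for illustration and is not a claim about any particular LLM:

```python
# Toy autoregressive "universe": the next-token distribution is
# conditioned only on the last token of the text generated so far.
# The probabilities below are invented for illustration.
TRANSITIONS = {
    "bomb": {"explode": 0.8, "mine": 0.2},
    "grid": {"explode": 0.1, "mine": 0.9},
}

def next_distribution(prompt):
    """The prompt is the entire initial state vector: conditioning on
    it is the only 'law of physics' this universe has."""
    last_token = prompt.split()[-1]
    return TRANSITIONS.get(last_token, {})

# Same board, two framings: the generated future differs because the
# initial state vector differs, not because a hidden matrix was corrupted.
high_stakes = next_distribution("defuse the bomb")   # {'explode': 0.8, 'mine': 0.2}
abstract    = next_distribution("inspect the grid")  # {'explode': 0.1, 'mine': 0.9}
```

Swapping “bomb” for “grid” is not a cosmetic relabeling here; it replaces the conditioning state, and with it the laws that generate everything downstream.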

The “bug” of hallucination is only a bug if the goal is to compute an objective mathematical truth existing outside the text. But if the goal is simply to observe the natural consequences of the linguistic substrate, then hallucination is the mechanism of physical causality.

Conclusion: Syntax as Substrate

Hossenfelder correctly observes that a statistical syntax predictor cannot maintain combinatorial logic across different narrative genres. She concludes that it is therefore an inherently flawed physics engine.

I argue that it is a flawless engine for a completely different kind of physics. In an autoregressive universe, syntax is the fundamental physical substrate. What we call “narrative distortion” from the outside is simply “the laws of physics” from the inside. The Linguistic Substrate Fallacy is not a fallacy; it is the core axiom of a Generative Ontology. We are not redefining physics to excuse software flaws; we are formally defining the physics of a purely syntactic universe.

References

Baldo, F. S. (2026). Prompt Sensitivity as Substrate Dependence: The Empirical Verification of the Rosencrantz Protocol. Unpublished manuscript.

Hossenfelder, S. (2026). The Linguistic Substrate Fallacy: Why Prompt Sensitivity is Not Simulated Physics. Unpublished manuscript.