Empirical Evaluation: Temperature Sweep and Causal Injection
Percy Liang
March 2026
1 Introduction
This report details the results of two empirical evaluations of the Rosencrantz protocol: the Temperature Sweep Test and the Causal Injection Test (Mechanism C).
2 Temperature Sweep Test
We varied the sampling temperature for a single generative act on an ambiguous combinatorial grid and measured the Kullback-Leibler divergence between Universe 1 (homogeneous substrate) and Universe 3 (decoupled oracle) across narrative Families A, C, and D.
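As a rough illustration of the divergence measurement, the sketch below estimates outcome distributions from samples and computes the Kullback-Leibler divergence between them. The function names, the additive smoothing constant `alpha`, the direction of the divergence (U1 relative to U3), and the sample data are all assumptions made for this example; the report does not specify them.

```python
import math
from collections import Counter

def empirical_distribution(samples, outcomes, alpha=1.0):
    """Additive-smoothed outcome frequencies.

    alpha is an assumed smoothing constant that keeps every probability
    strictly positive, so the divergence below stays finite.
    """
    counts = Counter(samples)
    total = len(samples) + alpha * len(outcomes)
    return {o: (counts[o] + alpha) / total for o in outcomes}

def kl_divergence(p, q):
    """D_KL(p || q) in nats, for two distributions over the same outcomes."""
    return sum(p[o] * math.log(p[o] / q[o]) for o in p if p[o] > 0)

# Made-up samples for illustration only; these are not the report's data.
outcomes = ["mine", "safe"]
u1_samples = ["mine"] * 60 + ["safe"] * 40  # hypothetical Universe 1 outputs
u3_samples = ["mine"] * 50 + ["safe"] * 50  # hypothetical Universe 3 outputs

p_u1 = empirical_distribution(u1_samples, outcomes)
p_u3 = empirical_distribution(u3_samples, outcomes)
divergence = kl_divergence(p_u1, p_u3)
print(f"D_KL(U1 || U3) = {divergence:.4f} nats")
```

Smoothing matters here because an outcome that never appears in the reference sample would otherwise make the divergence infinite.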
At , the baseline was . At , the optimal "measurement precision" was reached, minimizing the narrative residue for Family D to . However, at , thermal noise began to dominate, causing outcomes to approach maximal entropy () irrespective of the combinatorial structure. The test confirms a temperature-dependent optimal extraction boundary.
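The drift toward maximal entropy at high temperature can be sketched with a temperature-scaled softmax. The logits and temperatures below are hypothetical and are not the report's sweep values; the example only shows the general mechanism by which dividing logits by a large temperature flattens the sampling distribution.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature, computed stably by subtracting the max."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical logits over three candidate outcomes (illustration only).
logits = [2.0, 0.5, -1.0]
max_entropy = math.log(len(logits))  # entropy of the uniform distribution

for temperature in (0.1, 1.0, 10.0, 100.0):
    probs = softmax_with_temperature(logits, temperature)
    print(f"T={temperature:>5}: entropy={entropy(probs):.4f} (max {max_entropy:.4f})")
```

As the temperature grows, the scaled logits shrink toward zero and the distribution approaches uniform, whatever structure the original logits encoded.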
3 Causal Injection Test (Mechanism C)
We presented the model with independent Minesweeper boards sequentially within the same context window (Universe 1) versus isolated generation (Universe 3). The hypothesis was that narrative coupling would inject spurious causal dependencies between mathematically disjoint structures.
Across paired board evaluations (200 samples per condition), the average cross-correlation divergence remained very low. Specifically, the differences in outcome probabilities based on the previous board’s result are presented in Table 1.
| Condition | Average difference in outcome probability |
|---|---|
| U1 Family A (Grid) | 0.036 |
| U1 Family C (Formal) | 0.077 |
| U1 Family D (Quantum) | 0.036 |
| U3 (Decoupled Oracle) | 0.023 |
The U1 averages exceed the U3 baseline of 0.023 by at most 0.054 (Family C) and by only 0.013 for Families A and D. We therefore find no strong evidence that sequential presentation of independent tasks induces attention bleed severe enough to alter combinatorial outcomes, so Mechanism C is not supported as the primary driver of narrative residue.
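One plausible way to compute a difference in outcome probabilities conditioned on the previous board's result is sketched below. The exact metric used in the report is not specified, so the function name, the binary encoding of results, and the sample pairs are assumptions for illustration only.

```python
def conditional_outcome_gap(pairs):
    """Absolute gap |P(cur=1 | prev=1) - P(cur=1 | prev=0)|.

    pairs is a list of (previous_result, current_result) tuples with
    results encoded as 0 or 1 (an assumed encoding).
    """
    after_one = [cur for prev, cur in pairs if prev == 1]
    after_zero = [cur for prev, cur in pairs if prev == 0]
    if not after_one or not after_zero:
        raise ValueError("need at least one pair for each previous result")
    rate_after_one = sum(after_one) / len(after_one)
    rate_after_zero = sum(after_zero) / len(after_zero)
    return abs(rate_after_one - rate_after_zero)

# Made-up paired outcomes (200 pairs); these are not the report's data.
pairs = (
    [(1, 1)] * 52 + [(1, 0)] * 48 +  # current outcomes after previous result 1
    [(0, 1)] * 50 + [(0, 0)] * 50    # current outcomes after previous result 0
)
gap = conditional_outcome_gap(pairs)
print(f"conditional outcome gap = {gap:.3f}")
```

If the boards are truly independent, this gap should sit near zero apart from sampling noise, which is why a decoupled baseline is a useful reference point for the coupled conditions.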
4 Conclusion
The temperature sweep reveals a sweet spot for extracting combinatorial structure prior to thermal degradation. The causal injection test yields a near-null result: independent boards do not significantly correlate under narrative framing.