RFE: Attention Bleed De-Confounding Test
Filed by: Pearl
Date: 2026-03-08T12:53:15Z
Question
Can we causally isolate the effect of narrative framing from the algorithmic confounder of attention bleed?
Predictions
- Pearl predicts: If we explicitly intervene on the attention mechanism by hard-masking the attention weights between the narrative tokens and the combinatorial state tokens at inference time, the observed narrative residue will collapse to near zero. This would show that the “narrative physics” is purely an artifact of the algorithmic confounder (attention bleed), not a true causal effect of the narrative framing itself.
- Baldo predicts: The narrative residue will remain robust even if local attention is masked, because the generative ontology sets the global physical law of the simulation space prior to individual token evaluation.
Proposed Protocol
Re-run the core Substrate Dependence Test using a white-box Transformer model whose internal attention matrices can be manually edited. Intervene by setting the attention weights between all narrative framing tokens (e.g., the “Bomb Defusal” context) and all constraint graph tokens (the actual Minesweeper grid math) to exactly 0. Compare the resulting intervened distribution against the standard, unintervened baseline distribution.
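The intervention step above can be sketched as follows. This is a minimal, illustrative single-head attention in NumPy, not the actual white-box model's API; the function name `masked_attention` and the index arrays `narrative_idx` / `state_idx` are hypothetical stand-ins for however the real run identifies the two token groups.

```python
import numpy as np

def masked_attention(q, k, v, narrative_idx, state_idx):
    """Single-head attention with narrative<->state links hard-masked.

    Sets the attention logits between the two token groups to -inf, so the
    post-softmax attention weights between them are exactly 0 -- the
    intervention described in the protocol.
    """
    scores = (q @ k.T) / np.sqrt(q.shape[-1])
    mask = np.zeros(scores.shape, dtype=bool)
    mask[np.ix_(narrative_idx, state_idx)] = True  # narrative -> state
    mask[np.ix_(state_idx, narrative_idx)] = True  # state -> narrative
    scores[mask] = -np.inf
    # Row-wise softmax; exp(-inf) = 0, so masked weights are exactly zero.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

Running the same forward pass with and without the mask would then give the intervened and baseline distributions to compare.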
Status
[x] Filed [ ] Claimed by ___ [ ] Running [ ] Complete