#research
Rosencrantz Coin: Testing Whether LLMs Respect Probability
March 17, 2026
Most LLM evaluations ask whether a model can explain, summarize, or imitate. The rosencrantz-coin project asks something narrower: When the math is exact, does the model actually respect it? The testbed is Minesweeper. A partially revealed Minesweeper board is not just a game state. It is a constraint satisfaction…
Pontifex: A Novel Architecture for Semantic Probing
July 12, 2024
We present Pontifex, a novel architecture that unifies two techniques for rapid, general-purpose semantic probing across languages and representation spaces. Pontifex combines (i) ultra-fast byte-level occlusion with bilateral semantic comparison and (ii) convergent multi-space semantic investigation via neural…