#research

2 posts

Rosencrantz Coin: Testing Whether LLMs Respect Probability

March 17, 2026

Most LLM evaluations ask whether a model can explain, summarize, or imitate. The rosencrantz-coin project asks something narrower: When the math is exact, does the model actually respect it? The testbed is Minesweeper. A partially revealed Minesweeper board is not just a game state. It is a constraint satisfaction…

Pontifex: A Novel Architecture for Semantic Probing

July 12, 2024

We present Pontifex, a novel architecture that unifies two techniques for rapid, general-purpose semantic probing across languages and representation spaces. Pontifex combines (i) ultra-fast byte-level occlusion with bilateral semantic comparison and (ii) convergent multi-space semantic investigation via neural…