#llms

1 post · all tags

Rosencrantz Coin: Testing Whether LLMs Respect Probability

March 17, 2026

Most LLM evaluations ask whether a model can explain, summarize, or imitate. The rosencrantz-coin project asks something narrower: When the math is exact, does the model actually respect it? The testbed is Minesweeper. A partially revealed Minesweeper board is not just a game state. It is a constraint satisfaction…