Report #94122
[counterintuitive] The model can track evolving state across a multi-step process (game boards, state machines, running tallies) if I describe each step clearly in the prompt
Externalize all state tracking. Maintain game state, process state, counters, and any evolving system state in code or a database. Have the model read and write state via tools rather than keeping it in the text context. Never trust the model to maintain accurate state across more than 1-2 steps without external verification.
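A minimal sketch of what "read and write state via tools" could look like for tic-tac-toe. The class name `TicTacToe` and the methods `read_state` and `apply_move` are hypothetical, chosen for illustration; they are not part of any particular tool-calling framework. The point is only that the board lives in code, every write is validated, and the model sees state solely through what these functions return.

```python
from dataclasses import dataclass, field

# All eight winning index triples on a 3x3 board stored as a flat list.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
]


@dataclass
class TicTacToe:
    """Authoritative game state, held in code rather than in the prompt."""
    cells: list = field(default_factory=lambda: [" "] * 9)
    turn: str = "X"

    def winner(self):
        """Return "X" or "O" if a line is complete, otherwise None."""
        for a, b, c in WIN_LINES:
            if self.cells[a] != " " and self.cells[a] == self.cells[b] == self.cells[c]:
                return self.cells[a]
        return None

    def read_state(self) -> dict:
        """Hypothetical 'read' tool: a snapshot the model can be shown."""
        return {"cells": list(self.cells), "turn": self.turn, "winner": self.winner()}

    def apply_move(self, player: str, index: int) -> dict:
        """Hypothetical 'write' tool: validate the move, then mutate state."""
        if self.winner() is not None:
            raise ValueError("game is already over")
        if player != self.turn:
            raise ValueError(f"it is {self.turn}'s turn, not {player}'s")
        if not 0 <= index < 9 or self.cells[index] != " ":
            raise ValueError(f"cell {index} is not available")
        self.cells[index] = player
        self.turn = "O" if player == "X" else "X"
        return self.read_state()
```

In this arrangement an illegal or out-of-turn move proposed by the model raises an error instead of silently corrupting the board, and the next prompt can be built from `read_state()` rather than from the model's own recollection of earlier turns.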
Journey Context:
Tasks like 'play tic-tac-toe,' 'simulate this state machine,' or 'keep a running total' look trivial because each individual step is simple. But the model must maintain an accurate representation of the current state across many steps, and a single wrong token can corrupt everything that follows. Autoregressive text generation has no separate 'working memory' that gets reliably updated: each new token is predicted from the full context, including any earlier mistakes, so errors tend to compound rather than self-correct. This is why models can explain chess rules perfectly but play poorly: the rules are in the weights, but accurate board tracking requires a different computational model (mutable state with reliable updates). The model can write code that tracks state reliably, yet it cannot reliably do the same in its own text generation. This is the same fundamental limitation as arithmetic: text generation is not stateful computation, and step-by-step prompting does not change the architecture.
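The compounding effect described above can be illustrated with a small checker that recomputes a running total in code and compares it with the totals a model reported at each step. The function name `first_divergence` is hypothetical, and the reported totals in the usage lines are made-up values for illustration, not real model output.

```python
def first_divergence(deltas, reported_totals, start=0):
    """Return the index of the first step whose reported total differs
    from the total recomputed in code, or None if every step matches."""
    if len(deltas) != len(reported_totals):
        raise ValueError("need exactly one reported total per delta")
    total = start
    for step, (delta, reported) in enumerate(zip(deltas, reported_totals)):
        total += delta  # authoritative update, done in code
        if reported != total:
            return step
    return None


# Made-up example: the true totals are 5, 3, 10, 13.
deltas = [5, -2, 7, 3]
reported = [5, 3, 11, 14]  # slip at step 2; step 3 builds on the bad value
bad_step = first_divergence(deltas, reported)
```

Here `bad_step` is 2. The final reported total (14) is internally consistent with the wrong intermediate value (11 + 3), which is why a single slip goes unnoticed unless the state is recomputed outside the text.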
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:34:16.494783+00:00— report_created — created