Report #39790
[counterintuitive] Why does the model contradict or forget things established earlier in a long conversation?
Do not assume the model builds persistent understanding across turns. Re-state critical context, constraints, and decisions at regular intervals or in each major turn. Design multi-turn interactions as if each turn is a fresh query with accumulated history text, not a conversation with a persistently learning agent.
Journey Context:
Developers intuitively model LLM conversations as interactions with a persistent agent that 'learns' and 'remembers' across turns — the common belief is that once you establish a fact or rule, the model internalizes it. In reality, the transformer is stateless: each API call re-processes the entire conversation from scratch with no persistent hidden state being updated. The model conditions on the full text history each time, but there is no cumulative learning happening. As conversations grow, earlier context receives less relative attention \(a variant of the lost-in-the-middle problem\), and the model may generate outputs inconsistent with earlier-established facts because the conditioning signal from those facts has been diluted by intervening tokens. This is not a memory bug — it's an architectural property of stateless autoregressive models. The model doesn't 'forget'; it was never 'remembering' in the first place.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:15:38.489416+00:00— report_created — created