Report #52543
[counterintuitive] Model ignores or hallucinates information I clearly provided in the middle of a long prompt
Place critical information at the very beginning or very end of the context window. For RAG, arrange the retrieved documents so the most relevant ones sit at the edges of the context rather than in the middle. For long inputs, consider chunking and making multiple targeted calls rather than one monolithic prompt.
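A minimal sketch of the RAG reordering described above, assuming the retriever returns documents sorted best-first. The function name `order_for_edges` is hypothetical and not part of any library; it alternates items between the front and the back so the weakest matches end up in the middle.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def order_for_edges(ranked: Sequence[T]) -> List[T]:
    """Reorder a best-first ranking so the top items sit at both ends.

    Assumes `ranked` is sorted from most to least relevant.
    Even-indexed items fill the front in order; odd-indexed items
    fill the back in reverse, so the least relevant land in the middle.
    """
    front: List[T] = []
    back: List[T] = []
    for index, item in enumerate(ranked):
        if index % 2 == 0:
            front.append(item)
        else:
            back.append(item)
    return front + back[::-1]


if __name__ == "__main__":
    docs = ["d1", "d2", "d3", "d4", "d5"]  # d1 is the most relevant
    print(order_for_edges(docs))  # ['d1', 'd3', 'd5', 'd4', 'd2']
```

Whether this ordering helps depends on the model and the context length, so it is worth comparing against plain rank order on your own data.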
Journey Context:
Developers often assume that if information fits within the context window, the model attends to it equally. This is false. In practice, recall across a long context tends to follow a strong U-shaped curve: high for information at the start (primacy effect) and end (recency effect) of the context, but significantly degraded for information in the middle. This effect worsens as context length grows. A model with a 128k context window does not have uniform 128k 'working memory.' The model is not 'forgetting'; it is attending less to middle positions due to learned and structural attention patterns. Re-prompting or rephrasing won't fix this; restructuring the input to move key information to the edges does.
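One way to restructure a long input is to split it into chunks and make one targeted call per chunk, with the question stated at both edges of each prompt. The sketch below is illustrative only: `split_into_chunks`, `build_prompt`, and `ask_per_chunk` are hypothetical names, `call_model` stands in for whatever client function you already use, and chunking by character count is an assumption made for brevity (a real pipeline would more likely count tokens and split on document boundaries).

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Split text into consecutive pieces of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def build_prompt(question: str, chunk: str) -> str:
    """Put the question at the start and repeat it at the end of the prompt."""
    return (
        f"Question: {question}\n\n"
        f"Context:\n{chunk}\n\n"
        f"Reminder, the question is: {question}"
    )


def ask_per_chunk(
    question: str,
    text: str,
    call_model: Callable[[str], str],  # hypothetical: your own model client
    max_chars: int = 4000,
) -> List[str]:
    """Make one targeted call per chunk and return the answers in order."""
    return [
        call_model(build_prompt(question, chunk))
        for chunk in split_into_chunks(text, max_chars)
    ]


if __name__ == "__main__":
    # Stand-in for a real model call: reports the prompt length only.
    fake_model = lambda prompt: f"{len(prompt)} chars seen"
    print(ask_per_chunk("What is the refund policy?", "x" * 9000, fake_model))
```

The per-chunk answers still need to be merged or filtered afterwards; how to do that depends on the task and is left out here.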
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:41:15.446283+00:00— report_created — created