Report #68505
[counterintuitive] Put all relevant context in the prompt — the model will find and use what it needs
Place critical information at the very beginning or very end of the context window. For RAG pipelines, re-rank retrieved chunks so the most relevant appear at the edges, not the middle. Minimize total context length — more haystack makes the needle harder to find.
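A minimal sketch of the re-ranking step, assuming the retriever already returns chunks sorted best-first. The names `order_for_edges` and `build_context` and the `max_chunks` default are hypothetical, chosen for illustration only; they do not come from any particular RAG library.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def order_for_edges(ranked: Sequence[T]) -> List[T]:
    """Reorder items ranked best-first so the strongest sit at both ends.

    Items are dealt alternately to a front list and a back list, then the
    back list is reversed, so the weakest items land in the middle.
    """
    front: List[T] = []
    back: List[T] = []
    for i, item in enumerate(ranked):
        if i % 2 == 0:
            front.append(item)   # ranks 1, 3, 5, ... fill from the start
        else:
            back.append(item)    # ranks 2, 4, 6, ... fill from the end
    return front + back[::-1]


def build_context(ranked_chunks: Sequence[str], max_chunks: int = 5) -> str:
    """Keep only the top chunks (shorter context), then place them at the edges."""
    kept = list(ranked_chunks[:max_chunks])
    return "\n\n".join(order_for_edges(kept))
```

For five chunks ranked 1 to 5, `order_for_edges` yields the order 1, 3, 5, 4, 2: the best chunk opens the context, the second best closes it, and the weakest sits in the middle. Whether this helps a given model and context length is worth measuring before relying on it.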
Journey Context:
Developers often assume that if information appears anywhere in the context window, the model 'has' it and can use it. Liu et al. (2023) showed that this assumption fails for long contexts: accuracy follows a U-shaped curve over the position of the relevant information. Information at the beginning (primacy effect) and end (recency effect) is used well, but information in the middle of a long context is used significantly worse, for some models worse than answering with no retrieved documents at all. Instructions such as 'pay careful attention to all the provided context' do not reliably fix this, because the bias appears to come from how the model learned to distribute attention during training, not from how the prompt is worded. One plausible but unproven explanation is that training data emphasizes beginnings and endings (article intros, conversation turns, file headers and footers), so the model learns to weight those positions more heavily. Adding more context can therefore actively hurt performance if it pushes critical information into the middle dead zone. The practical implication for RAG: the order of retrieved chunks can matter as much as their relevance. A less relevant chunk at position 1 may be used better than a highly relevant chunk at position 5 of 10.
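To check whether a particular model shows this position effect, one option is a position sweep: keep the question and documents fixed and move the one relevant document through every slot. The sketch below is an illustration under stated assumptions, not the evaluation code from the cited paper. `ask` is a hypothetical caller-supplied function that sends a prompt to a model and returns its answer as a string, and the substring match is a deliberately crude correctness check.

```python
from typing import Callable, List


def position_sweep(gold: str, distractors: List[str]) -> List[List[str]]:
    """Return one document ordering per possible position of the gold document."""
    return [
        distractors[:pos] + [gold] + distractors[pos:]
        for pos in range(len(distractors) + 1)
    ]


def hits_by_position(
    question: str,
    gold: str,
    distractors: List[str],
    expected: str,
    ask: Callable[[str], str],  # hypothetical: prompt in, model answer out
) -> List[bool]:
    """For each gold position, record whether the answer contains `expected`."""
    results: List[bool] = []
    for docs in position_sweep(gold, distractors):
        prompt = "\n\n".join(docs) + "\n\nQuestion: " + question
        answer = ask(prompt)
        results.append(expected.lower() in answer.lower())
    return results
```

Averaging these per-position results over many questions gives an accuracy-by-position curve. A dip in the middle positions would indicate that the model under test has the problem described above.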
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T21:28:10.345337+00:00— report_created — created