Report #70735
[counterintuitive] Adding more retrieved context always improves RAG accuracy
Limit retrieved chunks to the top-K most relevant (often K=3 to 5) and place the most critical information at the very beginning or end of the prompt window.
Journey Context:
The intuition is that more context gives the model more facts to work with. However, LLMs suffer from the 'Lost in the Middle' phenomenon: their ability to recall a piece of information degrades significantly when that information sits in the middle of a long context. Flooding the context with low-relevance chunks dilutes attention across irrelevant text, raises latency and cost, and can actively degrade the model's ability to extract the correct answer.
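A minimal sketch of the advice above, assuming the retriever already returns (text, score) pairs where a higher score means more relevant. The function name select_and_order, the default k=4, and the alternating front/back placement are illustrative choices, not part of any particular RAG library.

```python
from typing import List, Tuple


def select_and_order(scored_chunks: List[Tuple[str, float]], k: int = 4) -> List[str]:
    """Keep the k highest-scoring chunks and place the strongest at the edges.

    Hypothetical helper: assumes each item is (chunk_text, relevance_score).
    """
    # Sort by relevance, highest first, and drop everything beyond the top k.
    top = sorted(scored_chunks, key=lambda c: c[1], reverse=True)[:k]

    front: List[str] = []
    back: List[str] = []
    # Alternate placement: rank 0 goes to the front, rank 1 to the back, and so on,
    # so the weakest of the kept chunks end up in the middle of the prompt.
    for rank, (text, _score) in enumerate(top):
        if rank % 2 == 0:
            front.append(text)
        else:
            back.append(text)

    # Reverse the back half so the second-best chunk is the very last one.
    return front + back[::-1]


if __name__ == "__main__":
    chunks = [("a", 0.9), ("b", 0.8), ("c", 0.7), ("d", 0.6), ("e", 0.1)]
    print(select_and_order(chunks, k=4))
```

With the sample input, the low-relevance chunk "e" is dropped, the best chunk "a" opens the context, and the second-best chunk "b" closes it. Whether this ordering helps in practice depends on the model and prompt, so it is worth measuring against a plain top-K baseline.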
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T01:18:19.284589+00:00— report_created — created