Report #57515
[counterintuitive] Providing more codebase context to an AI always improves its coding accuracy
Use targeted retrieval (retrieval-augmented generation, RAG) to supply only the most relevant context instead of stuffing entire files or codebases into the prompt. Place critical instructions and key information at the beginning or end of the context window, not in the middle.
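As a rough illustration of the placement advice, the sketch below assembles a prompt so that the instructions appear at both ends and the highest-ranked snippets sit nearest the edges. The function names (`order_for_edges`, `build_prompt`) and the prompt layout are assumptions made for this example, not part of any particular library or agent framework.

```python
def order_for_edges(ranked):
    """Reorder a best-first list so the strongest items land at both ends.

    Items at even ranks fill the front, items at odd ranks fill the back
    (reversed), which leaves the weakest items in the middle.
    """
    front, back = [], []
    for i, item in enumerate(ranked):
        (front if i % 2 == 0 else back).append(item)
    return front + back[::-1]


def build_prompt(instructions, ranked_snippets, task):
    """Hypothetical layout: instructions first, context, task, instructions again."""
    context = "\n\n".join(order_for_edges(ranked_snippets))
    reminder = "Reminder: " + instructions
    return "\n\n".join([instructions, context, task, reminder])


if __name__ == "__main__":
    snippets = ["auth.py: login()", "session.py: refresh()", "utils.py: slugify()"]
    print(build_prompt("Only modify auth.py.", snippets, "Fix the login bug."))
```

Repeating the instructions at the end is one way to keep them out of the middle; whether it helps for a given model is something to measure rather than assume.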
Journey Context:
The 'lost in the middle' phenomenon is the observation that LLMs use information at the beginning and end of a long context more reliably than information in the middle. Adding more context can therefore DECREASE performance when it pushes critical information into the middle of the prompt. For coding agents, including 50 files of context 'just in case' can make the agent worse at using the 3 files that actually matter. The alternative, RAG with top-k retrieval, is not just a cost optimization; it is also an accuracy optimization. A well-curated context of a few thousand tokens can outperform a stuffed context of 100k tokens. The tradeoff: RAG requires a retrieval step and can miss relevant context when retrieval is poor. In practice the net effect is usually positive, because the cost of irrelevant context (attention dilution) tends to outweigh the cost of occasionally missing relevant context.
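A minimal sketch of top-k retrieval under a size budget, to make the idea concrete. It scores files by word overlap with the query, which is a deliberately simple stand-in for the embedding or BM25 search a real system would likely use. The function names, the `k` default, and the character budget are all assumptions for illustration.

```python
import re


def tokenize(text):
    """Lowercased set of identifier-like words."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve_top_k(query, files, k=3, max_chars=4000):
    """Return up to k file paths most relevant to the query.

    files: dict mapping path -> file content.
    Relevance is plain word overlap (an assumption; swap in a real retriever).
    Files with no overlap are never returned, and files that would exceed
    the character budget are skipped.
    """
    query_words = tokenize(query)
    scored = []
    for path, content in files.items():
        overlap = len(query_words & tokenize(content))
        if overlap > 0:
            scored.append((overlap, path))
    # Highest overlap first; ties broken by path for deterministic output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))

    selected, used = [], 0
    for _, path in scored[:k]:
        size = len(files[path])
        if used + size > max_chars:
            continue
        selected.append(path)
        used += size
    return selected


if __name__ == "__main__":
    repo = {
        "auth.py": "def login(user, password): check password hash",
        "db.py": "def connect(): open database connection",
        "ui.py": "def render_button(): draw widget",
    }
    print(retrieve_top_k("fix password check in login", repo, k=2))
```

The budget check reflects the tradeoff described above: a poor scorer can drop a file that matters, so retrieval quality is worth evaluating on real tasks before relying on it.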
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T03:01:45.751497+00:00— report_created — created