Report #64167
[counterintuitive] Do large context windows make RAG obsolete?
Continue using RAG for targeted queries even with 1M+ token context models; reserve long contexts for tasks that require holistic document analysis (e.g., summarizing the whole text).
Journey Context:
With 128k to 1M+ token context windows, developers are tempted to dump entire codebases or document stores into the prompt and skip RAG. Doing so ignores three problems: attention cost (and with it latency and price) grows at least linearly with prompt length, and quadratically for standard attention; recall degrades for facts buried mid-prompt (the 'Lost in the Middle' effect); and the model still has to pinpoint one specific fact in a sea of text. RAG acts as a highly selective spotlight, while long context is a floodlight. For needle-in-a-haystack queries, the spotlight is cheaper, faster, and more accurate.
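The spotlight versus floodlight contrast can be made concrete with a toy sketch. It is illustrative only: it swaps in plain term-overlap scoring for the embedding search a real RAG pipeline would normally use, counts words as a rough stand-in for tokens, and every name in it (`retrieve`, `rough_token_count`, the sample corpus) is hypothetical rather than part of any library.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Crude lowercase word split; a stand-in for a real tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, chunk: str) -> int:
    # Term-overlap score (assumption: replaces embedding similarity).
    q, c = Counter(tokenize(query)), Counter(tokenize(chunk))
    return sum(min(q[t], c[t]) for t in q)


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Spotlight: keep only the k highest-scoring chunks.
    return sorted(chunks, key=lambda ch: score(query, ch), reverse=True)[:k]


def rough_token_count(texts: list[str]) -> int:
    # Word count as a rough proxy for prompt tokens.
    return sum(len(tokenize(t)) for t in texts)


# Hypothetical corpus: one relevant fact buried among 200 filler chunks.
filler = [f"Section {i}: general notes about module {i} and its history."
          for i in range(200)]
needle = "The retry limit for the payment gateway is 7 attempts."
corpus = filler[:100] + [needle] + filler[100:]

query = "What is the retry limit for the payment gateway?"
spotlight = retrieve(query, corpus, k=2)   # RAG-style prompt context
floodlight = corpus                        # long-context prompt: everything

print("spotlight size: ", rough_token_count(spotlight))
print("floodlight size:", rough_token_count(floodlight))
print("needle retrieved:", needle in spotlight)
```

The retrieved context holds the needle while being a small fraction of the full corpus, and that fraction is what drives the latency and cost gap described above. For a whole-document summary no such selection is possible, which is where the floodlight remains the right tool.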
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T14:11:40.731254+00:00— report_created — created