Report #75560
[counterintuitive] large context window replaces RAG
Continue using RAG/vector search for large knowledge bases; only use massive context windows for processing single large documents \(e.g., summarization\).
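One way to picture this recommendation is as a small routing rule. The sketch below is illustrative only: the function name `choose_strategy`, the 1M-token limit, and the 50% headroom factor are assumptions made for the example, not values taken from this report or from any particular model.

```python
def choose_strategy(
    num_documents: int,
    total_tokens: int,
    context_limit: int = 1_000_000,  # assumed window size, adjust per model
    headroom: float = 0.5,           # assumed safety margin, not a measured value
) -> str:
    """Return "full_context" for a single document that fits comfortably,
    otherwise "rag"."""
    # Leave room for instructions, the question, and the model's output.
    budget = int(context_limit * headroom)
    if num_documents == 1 and total_tokens <= budget:
        return "full_context"
    # Many documents, or one document too large for the budget: retrieve instead.
    return "rag"


if __name__ == "__main__":
    print(choose_strategy(num_documents=1, total_tokens=300_000))
    print(choose_strategy(num_documents=5_000, total_tokens=20_000_000))
```

The thresholds are placeholders; the point is only that the decision depends on document count and size, not on whether the window is technically large enough.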
Journey Context:
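With 1M\+ token context windows, developers often assume they can dump the whole database into the prompt. This fails for three reasons: 1\) attention dilution: models tend to use information buried in the middle of a long prompt less reliably \(the "Lost in the Middle" effect\); 2\) cost: every input token is billed on every request; 3\) latency: processing time grows with prompt length. With RAG, the prompt holds only the top-k retrieved chunks, so per-query token cost stays roughly constant as the knowledge base grows \(the vector lookup itself is typically sublinear with an approximate nearest-neighbor index rather than strictly O\(1\)\). Putting the full corpus in context makes per-query cost grow at least linearly, O\(N\), with corpus size, and answer quality shows diminishing returns as irrelevant text adds noise.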
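The cost argument can be made concrete with some back-of-the-envelope arithmetic. The sketch below compares input tokens per query for the two approaches. All numbers in it are assumptions chosen for illustration: the price per million tokens, the chunk size, and the top-k value are hypothetical, not quoted from any provider.

```python
# Illustrative cost model; every number here is an assumption, not a quoted price.

def prompt_tokens_full_context(corpus_tokens: int, question_tokens: int) -> int:
    """Tokens sent per query when the whole knowledge base goes into the prompt."""
    return corpus_tokens + question_tokens


def prompt_tokens_rag(top_k: int, chunk_tokens: int, question_tokens: int) -> int:
    """Tokens sent per query when only the top_k retrieved chunks go into the prompt."""
    return top_k * chunk_tokens + question_tokens


def input_cost(tokens: int, price_per_million: float) -> float:
    """Input-side cost of one query at a hypothetical per-million-token price."""
    return tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    PRICE = 3.0    # hypothetical USD per 1M input tokens
    QUESTION = 50  # assumed question length in tokens
    for corpus in (100_000, 500_000, 1_000_000):
        full = prompt_tokens_full_context(corpus, QUESTION)
        rag = prompt_tokens_rag(top_k=5, chunk_tokens=400, question_tokens=QUESTION)
        print(
            f"corpus={corpus:>9,} tokens  "
            f"full=${input_cost(full, PRICE):.4f}  "
            f"rag=${input_cost(rag, PRICE):.4f}"
        )
```

Under these assumptions the RAG prompt stays at the same size for every corpus, while the full-context prompt scales with the corpus. The model ignores embedding and index costs, which RAG also incurs, so treat it as a rough comparison of input tokens only.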
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:25:35.837318+00:00— report_created — created