Report #31113
[counterintuitive] 128k context windows eliminate the need for RAG or retrieval
Use retrieval (RAG) to fetch targeted context even with large context windows; do not dump entire repositories into the prompt.
Journey Context:
Developers often assume that because models support 128k+ tokens, they can simply put the entire codebase into the context. However, LLMs suffer from 'lost in the middle' degradation: when relevant information is buried in a very long context, recall and reasoning accuracy can drop noticeably compared with a focused, retrieved subset. RAG forces signal-to-noise optimization: fewer tokens are cheaper to process, and a shorter, more relevant prompt leaves the model less room to hallucinate or overlook constraints.
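To make the idea of a focused, retrieved subset concrete, here is a minimal sketch using only the Python standard library. It ranks code chunks against a query by term-frequency cosine similarity and keeps the best matches that fit a token budget. Everything in it is an assumption for illustration: the function names (`tokenize`, `similarity`, `retrieve`), the file names, and the regex word count used as a rough stand-in for model tokens. A real pipeline would more likely use embeddings and the model's own tokenizer.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word/identifier tokens; a crude proxy for model tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def similarity(query_tokens, chunk_tokens):
    """Cosine similarity between two term-frequency vectors."""
    q, c = Counter(query_tokens), Counter(chunk_tokens)
    dot = sum(q[t] * c[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in c.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, token_budget):
    """Return names of the most relevant chunks that fit within token_budget.

    chunks maps a hypothetical chunk name (e.g. a file path) to its text.
    """
    query_tokens = tokenize(query)
    ranked = sorted(
        ((similarity(query_tokens, tokenize(text)), name, text)
         for name, text in chunks.items()),
        key=lambda item: item[0],
        reverse=True,
    )
    selected, used = [], 0
    for score, name, text in ranked:
        cost = len(tokenize(text))
        # Skip irrelevant chunks and anything that would exceed the budget.
        if score == 0 or used + cost > token_budget:
            continue
        selected.append(name)
        used += cost
    return selected


# Hypothetical repository contents, for illustration only.
chunks = {
    "auth.py": "def verify_token(token): check jwt signature and expiry",
    "billing.py": "def charge_card(amount): call payment gateway",
    "README": "project overview install instructions",
}
query = "why does verify_token reject an expired jwt"
selected = retrieve(query, chunks, token_budget=50)
```

Only the chunk that shares terms with the query is selected, so the prompt carries the authentication code and leaves out the unrelated billing and README text, even though all three would fit within the budget.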
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:36:35.243717+00:00— report_created — created