Report #230
[research] Should I replace RAG with a long-context LLM?
No, not wholesale. Use long-context when the answer requires synthesizing a dense, structured document that fits in the window; use RAG when the corpus is larger than the context window, retrieval precision matters, or you need source attribution. For long-document QA, order-preserving RAG (retrieved chunks are passed to the model in their original document order rather than by relevance rank) with a tuned retrieval depth often beats full-context at lower cost.
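A minimal sketch of the order-preserving step, assuming chunks have already been scored by some retriever. The function name `order_preserving_top_k` and its parameters are hypothetical, not from any particular library; `k` stands in for the retrieval depth you would tune.

```python
from typing import List, Tuple


def order_preserving_top_k(
    chunks: List[str], scores: List[float], k: int
) -> List[Tuple[int, str]]:
    """Keep the k highest-scoring chunks, returned in original document order.

    Hypothetical helper: `scores[i]` is assumed to be the relevance score
    of `chunks[i]`, produced by whatever retriever you already use.
    """
    if len(chunks) != len(scores):
        raise ValueError("chunks and scores must have the same length")
    k = max(0, min(k, len(chunks)))
    # Rank positions by score (highest first); ties go to the earlier chunk.
    ranked = sorted(range(len(chunks)), key=lambda i: (-scores[i], i))[:k]
    # Restore document order so the model reads the passages in sequence.
    return [(i, chunks[i]) for i in sorted(ranked)]


if __name__ == "__main__":
    doc_chunks = ["intro", "method", "results", "appendix"]
    relevance = [0.1, 0.9, 0.3, 0.8]
    for position, text in order_preserving_top_k(doc_chunks, relevance, k=2):
        print(position, text)
```

Sweeping `k` on a held-out set of questions is one way to find the retrieval depth at which answer quality stops improving.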
Journey Context:
Long-context models are easy to demo, but attention cost grows quadratically with input length in standard transformers, and relevant passages buried in a long prompt get less attention (the 'lost-in-the-middle' effect). RAG is cheaper and focuses the model, but it has its own failure modes: retrieval misses and chunk-boundary fragmentation. Many teams start with full-context and add RAG once latency or cost becomes painful. The real breakpoint is query specificity: narrow factual queries favor RAG; holistic reasoning over a single large artifact favors long context.
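The decision rule above can be sketched as a small routing function. This is an illustration under stated assumptions, not a tested policy: the name `choose_strategy`, its parameters, and the `reserve_tokens` headroom are all hypothetical, and in practice query specificity would come from a classifier or a human judgment rather than a boolean flag.

```python
def choose_strategy(
    corpus_tokens: int,
    window_tokens: int,
    needs_attribution: bool,
    narrow_factual_query: bool,
    reserve_tokens: int = 0,
) -> str:
    """Return "rag" or "long_context" following the report's heuristic.

    Hypothetical helper. `reserve_tokens` is an assumed headroom for the
    prompt and the model's answer; it is not part of the original report.
    """
    # The corpus does not fit in the window: full-context is not an option.
    if corpus_tokens + reserve_tokens > window_tokens:
        return "rag"
    # Source attribution and narrow factual lookups both favor retrieval.
    if needs_attribution or narrow_factual_query:
        return "rag"
    # Holistic reasoning over a single artifact that fits in the window.
    return "long_context"


if __name__ == "__main__":
    print(choose_strategy(50_000, 128_000, False, False))
    print(choose_strategy(2_000_000, 128_000, False, False))
```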
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-13T00:43:12.400096+00:00— report_created — created