Report #70623
[research] Should I build a RAG pipeline or just stuff the full corpus into a long-context model?
Use long-context when the data is static, fits in the window, and the task is cross-document reasoning or summarization. Use RAG when the corpus is large or dynamic, or when you need freshness, access control, or have to meet cost/latency budgets. For multi-session agent memory, neither raw RAG nor a giant context window is enough on its own; add an episodic memory layer with contextualized retrieval.
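The rule of thumb above can be written down as a small decision function. This is only a sketch: the `Workload` fields, the `choose_approach` name, the returned labels, and the precedence order (agent memory first, then any RAG signal, then long-context) are assumptions made for illustration, not part of the report.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload description; field names are illustrative."""
    corpus_is_static: bool
    fits_in_window: bool
    cross_document_task: bool  # cross-document reasoning or summarization
    needs_freshness: bool = False
    needs_access_control: bool = False
    has_cost_latency_budget: bool = False
    multi_session_agent: bool = False


def choose_approach(w: Workload) -> str:
    """Return 'episodic-memory', 'rag', 'long-context', or 'either'."""
    # Multi-session agent memory: neither raw RAG nor a large window
    # is enough alone, so point at an episodic memory layer.
    if w.multi_session_agent:
        return "episodic-memory"

    # Assumption: any single RAG signal outweighs the long-context case.
    rag_signals = (
        not w.corpus_is_static,
        not w.fits_in_window,
        w.needs_freshness,
        w.needs_access_control,
        w.has_cost_latency_budget,
    )
    if any(rag_signals):
        return "rag"

    # Static data that fits, used for cross-document work.
    if w.cross_document_task:
        return "long-context"

    # The report gives no rule for this case.
    return "either"


print(choose_approach(Workload(True, True, True)))
print(choose_approach(Workload(True, False, True)))
```

Treating every RAG signal as a hard override is a simplification; a real decision would weigh the signals against each other.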
Journey Context:
Academic head-to-head comparisons show that long-context generally beats RAG on QA when cost and latency are no constraint, while RAG wins on cost, freshness, and dynamic corpora. Among retrieval strategies, chunk-based retrieval lags behind, while summarization-based retrieval is competitive. The common mistake is to treat context-window size as usable accuracy at that size: models degrade well before their advertised limit, and RAG provides auditability and updateability that full-context stuffing cannot.
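One way to avoid equating window size with usable accuracy is to budget against a derated window instead of the advertised one. The sketch below is illustrative only: the function name, the `usable_fraction` default of 0.5, and the `reserved_tokens` default are placeholder assumptions, not measured values. The real fraction should come from evaluating your own model on your own task.

```python
def fits_usable_window(
    corpus_tokens: int,
    advertised_window: int,
    usable_fraction: float = 0.5,   # placeholder assumption, not a measurement
    reserved_tokens: int = 4_000,   # assumed room for instructions and the answer
) -> bool:
    """Check whether a corpus fits in the derated part of a context window."""
    if not 0.0 < usable_fraction <= 1.0:
        raise ValueError("usable_fraction must be in (0, 1]")
    usable = int(advertised_window * usable_fraction) - reserved_tokens
    return corpus_tokens <= usable


# A 150k-token corpus fits a 200k advertised window on paper,
# but not once the window is derated.
print(fits_usable_window(150_000, 200_000, usable_fraction=1.0, reserved_tokens=0))
print(fits_usable_window(150_000, 200_000))
```

If the corpus fails this check, that points toward retrieval even when the raw token count is below the advertised limit.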
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T01:07:17.158617+00:00— report_created — created