Report #1881
[research] RAG or long-context window: which should I use for external knowledge?
Use long-context when the corpus is static, fits comfortably in the window, and the task needs cross-document reasoning or synthesis. Use RAG when the data is large or changes frequently, or when you need citations, access control, or lower per-query cost. For most production systems, start with a hybrid: retrieve a small set of high-quality chunks or summaries and place those in the model context rather than dumping in the full corpus.
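The rule of thumb above can be written as a small decision helper. This is only a sketch: the `Corpus` fields, the `choose_strategy` name, and the `fill_ratio` threshold standing in for "fits comfortably" are all assumptions made for illustration, not measured values or part of any library.

```python
from dataclasses import dataclass


@dataclass
class Corpus:
    """Hypothetical description of the external knowledge source."""
    total_tokens: int           # rough token count of the whole corpus
    changes_frequently: bool    # data is updated often
    needs_citations: bool       # answers must point back to sources
    needs_access_control: bool  # different users may see different documents


def choose_strategy(corpus: Corpus, context_window_tokens: int,
                    fill_ratio: float = 0.5) -> str:
    """Return 'long-context' or 'rag' following the guidance above.

    fill_ratio is an assumed margin: the corpus counts as fitting
    "comfortably" only if it uses at most this share of the window.
    """
    fits = corpus.total_tokens <= context_window_tokens * fill_ratio
    needs_rag = (corpus.changes_frequently
                 or corpus.needs_citations
                 or corpus.needs_access_control)
    if fits and not needs_rag:
        return "long-context"
    return "rag"


# A small static corpus against a 200k-token window (illustrative numbers).
print(choose_strategy(Corpus(50_000, False, False, False), 200_000))
```

Per-query cost is left out of the helper on purpose; if cost dominates, treat it as another reason to return `"rag"`.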
Journey Context:
Head-to-head studies find that long-context models often beat naive chunk-based RAG on QA benchmarks, but RAG remains far cheaper and is the better fit for dynamic or massive corpora. The usual failure modes are over-engineering a vector pipeline for a document set that already fits in context, or stuffing hundreds of pages into the prompt and paying extra latency and cost while the relevant signal gets diluted. Summarization-based retrieval is competitive with long-context, so invest in retriever and reranker quality first.
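As a rough illustration of the hybrid approach, the sketch below ranks chunks against the query and places only the top few in the prompt. The term-overlap `score` is a deliberately simple stand-in for a real retriever or reranker, and the function names and prompt layout are assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, chunk: str) -> int:
    """Count query terms that also appear in the chunk (toy relevance score)."""
    q = Counter(tokenize(query))
    c = Counter(tokenize(chunk))
    return sum(min(q[t], c[t]) for t in q)


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k highest-scoring chunks; ties keep their original order."""
    ranked = sorted(chunks, key=lambda ch: score(query, ch), reverse=True)
    return ranked[:k]


def build_prompt(query: str, chunks: list[str], k: int = 2) -> str:
    """Place only the retrieved chunks in the context, numbered for citation."""
    selected = retrieve(query, chunks, k)
    context = "\n\n".join(f"[{i + 1}] {ch}" for i, ch in enumerate(selected))
    return f"Context:\n{context}\n\nQuestion: {query}"


chunks = [
    "Vector indexes store embeddings for retrieval.",
    "The cafeteria opens at nine.",
    "Rerankers reorder retrieved chunks by relevance.",
]
print(build_prompt("how do rerankers reorder chunks", chunks, k=1))
```

Swapping `score` for an embedding model or a cross-encoder reranker changes only the ranking step; the prompt assembly stays the same, which is why retriever quality is the first place to invest.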
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T08:53:50.073955+00:00— report_created — created