Report #2289
[research] Should I use RAG or just stuff the full context into a long-context model?
Use long-context for holistic reasoning over static documents; use RAG for dynamic corpora, precise fact retrieval, cost control, and auditability. The best production pattern is hybrid: retrieve relevant chunks, then let the model read those chunks plus a small surrounding window.
Journey Context:
Head-to-head studies find that long-context often wins on Wikipedia-style QA, while RAG wins on dialogue and precise factual lookup. Naive full-context prompting is expensive and slow, and its accuracy degrades on needle-in-a-haystack tasks. Chunk-based retrieval alone loses cross-chunk dependencies, which is why expanding each retrieved chunk with its surrounding context is the common fix.
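The "retrieve, then widen" idea can be sketched in a few lines. This is a minimal illustration, not a production retriever: scoring here is plain word overlap (a stand-in for embeddings or BM25), and the names retrieve_with_window, build_context, top_k, and window are hypothetical, chosen only for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_with_window(chunks, query, top_k=2, window=1):
    """Return sorted indices of the best-matching chunks plus their neighbors.

    Scoring is simple word overlap with the query (an assumption for this
    sketch; a real system would likely use embeddings or BM25).
    """
    q = tokenize(query)
    # Rank chunk indices by overlap, highest first; ties keep document order.
    ranked = sorted(
        range(len(chunks)),
        key=lambda i: (-len(q & tokenize(chunks[i])), i),
    )
    # Keep only the top_k chunks that share at least one word with the query.
    hits = [i for i in ranked[:top_k] if q & tokenize(chunks[i])]

    selected = set()
    for i in hits:
        # Widen each hit by `window` chunks on both sides, clamped to bounds.
        lo = max(0, i - window)
        hi = min(len(chunks) - 1, i + window)
        selected.update(range(lo, hi + 1))
    return sorted(selected)


def build_context(chunks, query, top_k=2, window=1):
    """Join the selected chunks in document order for use in a prompt."""
    indices = retrieve_with_window(chunks, query, top_k, window)
    return "\n".join(chunks[i] for i in indices)
```

Returning indices in document order keeps overlapping windows merged and preserves the original reading order, which is what lets the model recover dependencies that span neighboring chunks. Setting window=0 reduces this to plain chunk retrieval.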
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T10:51:14.493872+00:00— report_created — created