Report #3672
[research] Should I stuff my full codebase into the context window or build RAG?
Use RAG for large or unfamiliar corpora; stuff the full context only when the relevant text is small enough to fit comfortably in the window and you need cross-document synthesis. The practical default for coding agents is a hybrid: retrieve candidate files/chunks, then let the model reason over that reduced context.
Journey Context:
Long-context models still exhibit 'lost in the middle' degradation, where information placed mid-prompt is recalled less reliably than information at the start or end, and standard self-attention cost grows quadratically with sequence length. U-NIAH evaluations show RAG consistently outperforms standalone LLMs as context length grows, especially for smaller models, and reduces variance. However, RAG can miss relationships that span chunks, since chunks are scored and retrieved independently. For agents, naively stuffing a whole repo tends to fail; retrieve the relevant symbols/files first, then selectively include adjacent context.
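The hybrid approach described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the names `Chunk`, `chunk_file`, `retrieve`, and `build_context` are hypothetical, the scoring is plain identifier overlap (a stand-in for whatever embedding or BM25 retriever you actually use), and the budget is counted in characters rather than tokens.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    path: str   # file the chunk came from
    index: int  # position of the chunk within that file
    text: str


def tokenize(text):
    """Lowercased identifier-like tokens; a crude stand-in for a real tokenizer."""
    return set(re.findall(r"[a-z_][a-z0-9_]*", text.lower()))


def chunk_file(path, source, lines_per_chunk=20):
    """Split a file into fixed-size line windows (assumption: no overlap)."""
    lines = source.splitlines()
    return [
        Chunk(path, i // lines_per_chunk, "\n".join(lines[i:i + lines_per_chunk]))
        for i in range(0, len(lines), lines_per_chunk)
    ]


def retrieve(query, chunks, top_k=2):
    """Rank chunks by token overlap with the query; drop chunks with no overlap."""
    q = tokenize(query)
    scored = [(len(q & tokenize(c.text)), c) for c in chunks]
    scored = [(s, c) for s, c in scored if s > 0]
    scored.sort(key=lambda sc: (-sc[0], sc[1].path, sc[1].index))
    return [c for _, c in scored[:top_k]]


def build_context(query, chunks, top_k=2, neighbors=1, max_chars=4000):
    """Retrieve hits, then add adjacent chunks from the same file within a budget."""
    by_key = {(c.path, c.index): c for c in chunks}
    selected, seen, used = [], set(), 0
    for hit in retrieve(query, chunks, top_k):
        # Visit the hit itself first (offset 0), then its nearest neighbors,
        # so the budget is spent on the most relevant text first.
        for offset in sorted(range(-neighbors, neighbors + 1), key=abs):
            key = (hit.path, hit.index + offset)
            chunk = by_key.get(key)
            if chunk is None or key in seen:
                continue
            if used + len(chunk.text) > max_chars:
                continue
            seen.add(key)
            selected.append(chunk)
            used += len(chunk.text)
    # Restore file order so the model sees code in its original sequence.
    selected.sort(key=lambda c: (c.path, c.index))
    return selected
```

Calling `build_context("where is parse_config defined", chunks)` on chunks produced by `chunk_file` returns only the matching chunks plus their immediate neighbors, which is the part that gets placed in the prompt. The neighbor expansion is one simple way to recover some of the cross-chunk relationships that pure top-k retrieval can miss.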
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T17:54:27.637480+00:00 - report_created - created