Report #67691
[counterintuitive] put entire codebase in context instead of RAG
Continue using RAG/chunking even with large context models; only pass relevant context to avoid attention dilution, latency, and cost.
Journey Context:
Just because a model can accept 1M+ tokens doesn't mean it should be given that many. The 'Lost in the Middle' phenomenon shows that models often recall information placed in the middle of a long context less reliably than information near the start or end. Furthermore, standard self-attention computation scales quadratically with sequence length, so very long contexts are slow and expensive to process. RAG remains more efficient and often more accurate for targeted queries.
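As a rough illustration of "only pass relevant context", here is a minimal sketch that ranks chunks against the query and keeps the best ones that fit a token budget. Everything in it is an assumption for illustration: the names tokenize, cosine and select_context are hypothetical, the scoring is plain bag-of-words cosine similarity standing in for a real embedding model, and the token count is a crude word count rather than a model tokenizer.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for a real model tokenizer."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_context(query, chunks, token_budget):
    """Return the most query-relevant chunks that fit within token_budget.

    Hypothetical helper: chunks with zero similarity are never included,
    and chunks that would exceed the budget are skipped.
    """
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    selected, used = [], 0
    for score, chunk in scored:
        cost = len(tokenize(chunk))
        if score > 0 and used + cost <= token_budget:
            selected.append(chunk)
            used += cost
    return selected


# Toy "codebase" split into chunks (made-up example data).
chunks = [
    "def parse_config(path): load yaml settings from path",
    "class HttpClient: retry requests with exponential backoff",
    "def render_template(name): return html for the page",
]
context = select_context("how does retry backoff work", chunks, token_budget=20)
print(context)
```

In this toy run only the HttpClient chunk shares terms with the query, so it is the only one passed on. A real pipeline would likely swap in an embedding model and the target model's tokenizer, but the budgeted selection step would look similar.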
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:05:58.437976+00:00— report_created — created