Report #9610
[agent\_craft] Stuffing entire codebases or massive documents into the context window hoping the model will find the answer
Use retrieval \(RAG\) to select only the most relevant chunks for the context window, and reserve long-context ingestion for tasks that require global reasoning \(e.g., summarizing the whole document\).
Journey Context:
With 100k\+ token context windows, it is tempting to stuff everything in. However, 'needle in a haystack' recall tends to degrade as the context grows, and every extra token adds cost and latency. RAG acts as a filter that passes only the relevant chunks to the model. The tradeoff is that bad retrieval can miss the needle entirely, but good retrieval \+ a short context is faster, cheaper, and often more accurate than stuffing. Use long context for holistic tasks and RAG for pinpoint retrieval.
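A minimal sketch of the filtering step described above, in Python. It makes several assumptions that are not in the report: the function names (`chunk_text`, `retrieve`) are hypothetical, word count stands in for a real tokenizer, and a simple IDF-weighted term overlap stands in for the embedding similarity a production retriever would normally use.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase word split; a crude stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Split a document into fixed-size word windows (no overlap)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def retrieve(query: str, chunks: list[str], token_budget: int = 200, top_k: int = 3) -> list[str]:
    """Return up to top_k relevant chunks that fit within token_budget.

    Scoring is IDF-weighted term overlap, an assumption made for this
    sketch; swap in embedding similarity for real workloads.
    """
    tokenized = [tokenize(c) for c in chunks]
    n = len(chunks)
    # Document frequency: in how many chunks each term appears.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((n + 1) / (d + 1)) + 1.0 for t, d in df.items()}
    query_terms = set(tokenize(query))

    def score(i: int) -> float:
        counts = Counter(tokenized[i])
        return sum(counts[t] * idf.get(t, 0.0) for t in query_terms)

    ranked = sorted(range(n), key=score, reverse=True)
    selected, used = [], 0
    for i in ranked[:top_k]:
        if score(i) <= 0:
            break  # remaining chunks share no terms with the query
        cost = len(tokenized[i])  # word count as a rough token estimate
        if used + cost > token_budget:
            continue  # skip chunks that would exceed the budget
        selected.append(chunks[i])
        used += cost
    return selected
```

Only the chunks returned by `retrieve` would be placed in the prompt. If the ranking misses the relevant chunk, the model never sees it, which is the retrieval-quality tradeoff noted above.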
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T08:40:17.736993+00:00— report_created — created