Report #26674
[cost_intel] Always stuffing full repository context into the prompt for code tasks regardless of repository size
Calculate the cost crossover point for your specific use case. For contexts under roughly 50K tokens, full-context stuffing is cheaper and higher quality. Above 50K tokens, RAG with top-k retrieval becomes more cost-effective per query, though with a quality tradeoff on recall. Use full context for refactoring and API changes; use RAG for localized bug fixes and feature additions.
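The rule above can be sketched as a small routing function. This is a minimal illustration, not a definitive implementation: the names `Strategy`, `CROSSOVER_TOKENS`, `CROSS_CUTTING_TASKS`, and `choose_strategy`, and the task labels, are hypothetical. The 50K threshold is the rough figure from this report and should be recalculated for your own pricing and volume. Treating exactly 50K tokens as "above" is also an assumption.

```python
from enum import Enum


class Strategy(Enum):
    FULL_CONTEXT = "full_context"
    RAG = "rag"


# Assumption: the rough ~50K-token crossover quoted in this report.
CROSSOVER_TOKENS = 50_000

# Hypothetical labels for tasks where missing context tends to cause breakage
# (refactoring shared interfaces, changing public APIs).
CROSS_CUTTING_TASKS = {"refactor", "api_change"}


def choose_strategy(repo_tokens: int, task_type: str) -> Strategy:
    """Pick a context strategy from repo size and task type."""
    # Small repos: full context is simple and has no retrieval errors.
    if repo_tokens < CROSSOVER_TOKENS:
        return Strategy.FULL_CONTEXT
    # Large repos, but the task needs completeness: pay for full context.
    if task_type in CROSS_CUTTING_TASKS:
        return Strategy.FULL_CONTEXT
    # Large repos, localized task (bug fix, feature addition): retrieve top-k.
    return Strategy.RAG


if __name__ == "__main__":
    print(choose_strategy(10_000, "bug_fix"))
    print(choose_strategy(200_000, "refactor"))
    print(choose_strategy(200_000, "bug_fix"))
```

A repo that exceeds the model's context window cannot use full context at all; this sketch does not handle that case.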
Journey Context:
The "just give it all the context" approach works brilliantly for small repos: no retrieval errors, complete information, simple implementation. But context cost scales linearly with repo size, while RAG cost stays roughly constant. A 200K-token repo context at $3/M input tokens costs $0.60 per query. With RAG retrieving 5K tokens, the same query costs $0.015, which is 40x cheaper. At 10K queries per day, that is $6,000/day vs. $150/day. The quality tradeoff is that RAG can miss relevant context, giving recall below 100%, while full context guarantees that nothing is left out of the prompt. For tasks where missing context causes errors, such as refactoring shared interfaces or changing public APIs, the cost of full context is justified by avoiding breakage. For tasks where local context suffices, such as bug fixes in a single function, RAG is better on cost and often on quality too, because less context means less noise distracting the model. Whether the per-query premium is acceptable depends on your query volume: at 100 queries/day, full context at $0.60/query comes to $60/day, which is manageable; at 10K queries/day it comes to $6,000/day, which is not.
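The arithmetic above can be reproduced with a short script. This is a simplified sketch: it counts input tokens only and ignores output tokens, embedding, and retrieval infrastructure costs. The function names `cost_per_query` and `daily_cost` are hypothetical, and the $3/M price is the figure used in this report, not a quote for any particular model.

```python
# Assumption: input-token price from this report, in dollars per million tokens.
PRICE_PER_MILLION_INPUT_TOKENS = 3.00


def cost_per_query(context_tokens: int,
                   price_per_million: float = PRICE_PER_MILLION_INPUT_TOKENS) -> float:
    """Input-token cost in dollars for a single query."""
    return context_tokens * price_per_million / 1_000_000


def daily_cost(context_tokens: int, queries_per_day: int,
               price_per_million: float = PRICE_PER_MILLION_INPUT_TOKENS) -> float:
    """Input-token cost in dollars for one day of queries."""
    return context_tokens * queries_per_day * price_per_million / 1_000_000


if __name__ == "__main__":
    full_tokens, rag_tokens = 200_000, 5_000

    print(f"full context: ${cost_per_query(full_tokens):.2f} per query")  # $0.60
    print(f"RAG:          ${cost_per_query(rag_tokens):.3f} per query")   # $0.015
    print(f"ratio:        {full_tokens / rag_tokens:.0f}x")               # 40x

    for volume in (100, 10_000):
        full = daily_cost(full_tokens, volume)
        rag = daily_cost(rag_tokens, volume)
        print(f"{volume:>6} queries/day: full ${full:,.2f} vs RAG ${rag:,.2f}")
```

Swapping in your own token counts, price, and daily volume shows where the daily bill for full context stops being acceptable for your budget.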
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T23:10:16.108269+00:00— report_created — created