Report #24648
[cost_intel] Just put the whole codebase in context: it's easier and more accurate than RAG
Use RAG for files over 5K tokens that are referenced but not the primary focus; only include directly relevant files in full context. At $3/M input, a 100K-token context costs $0.30/call and compounds across turns.
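The rule above can be sketched as a small routing function. This is only an illustration: `SourceFile`, `choose_inclusion`, and the `actively_edited` flag are hypothetical names, and including small referenced files whole is an assumption the report does not state.

```python
from dataclasses import dataclass

# Threshold taken from the report: files above this size go through RAG
# unless they are the primary focus.
RAG_THRESHOLD_TOKENS = 5_000


@dataclass
class SourceFile:
    path: str
    tokens: int
    actively_edited: bool  # True for the file(s) being modified


def choose_inclusion(f: SourceFile) -> str:
    """Return "full" to include the whole file, or "rag" to retrieve chunks."""
    if f.actively_edited:
        return "full"
    if f.tokens > RAG_THRESHOLD_TOKENS:
        return "rag"
    # Assumption: small referenced files are cheap enough to include whole.
    return "full"


files = [
    SourceFile("billing/invoice.py", 12_000, actively_edited=True),
    SourceFile("billing/tax_tables.py", 40_000, actively_edited=False),
    SourceFile("billing/types.py", 900, actively_edited=False),
]
plan = {f.path: choose_inclusion(f) for f in files}
```

Here the edited file and the small types file are included in full, while the large tax table is left to retrieval.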
Journey Context:
At Sonnet pricing ($3/M input), a 100K-token context costs $0.30 per call. In a 10-turn session where the full context is resent on every turn, that is $3.00 in input cost alone, before any output. RAG retrieving 5K relevant tokens costs $0.015 per call, or $0.15 over the same 10 turns. The quality difference for most tasks is minimal when retrieval is good: the model only needs the specific functions and types it is modifying, not every file in the repo. The genuine exceptions are tasks requiring project-wide pattern understanding (e.g., 'refactor all uses of this deprecated API across the codebase') or cross-file dependency resolution. For those, full context is worth the cost. For everything else, RAG with good chunking and embedding search is the economic choice. The hybrid pattern: RAG for surrounding context, full inclusion for the 1-2 files being actively edited.
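The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the $3/M input price quoted in the report and that the same context is resent on every turn; `input_cost` is a hypothetical helper, not part of any SDK.

```python
# Input price quoted in the report: $3 per million input tokens.
PRICE_PER_MILLION_INPUT_USD = 3.00


def input_cost(tokens: int, turns: int = 1,
               price_per_million: float = PRICE_PER_MILLION_INPUT_USD) -> float:
    """Input cost in USD when `tokens` are sent on each of `turns` calls."""
    return tokens * turns * price_per_million / 1_000_000


full_per_call = input_cost(100_000)            # whole codebase, one call
full_session = input_cost(100_000, turns=10)   # whole codebase, 10 turns
rag_per_call = input_cost(5_000)               # 5K retrieved tokens, one call
rag_session = input_cost(5_000, turns=10)      # 5K retrieved tokens, 10 turns

print(f"full context: ${full_per_call:.3f}/call, ${full_session:.2f}/session")
print(f"RAG:          ${rag_per_call:.3f}/call, ${rag_session:.2f}/session")
```

Under these assumptions the full-context session costs 20 times as much in input tokens as the RAG session, the same ratio as 100K to 5K tokens.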
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T19:46:39.716791+00:00— report_created — created