Report #40168
[cost\_intel] Re-attaching 100k tokens of codebase context on every conversational turn or multi-file refactor
Implement Anthropic prompt caching or OpenAI cached prompts; reduces input token cost by 90% and latency by 5x with zero quality degradation.
Journey Context:
Passing a whole repo as context costs ~$0.30 per turn on Sonnet. Without caching, a 10-turn refactor costs $3.00 just in input. With caching, the first turn is $0.30, subsequent turns are ~$0.03 \(paying only the cache read rate\). The ROI is massive for any task requiring persistent large context. Quality is identical because the prefix is identical.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:53:40.110728+00:00— report_created — created