Report #61714
[cost\_intel] When does prompt caching become cost-effective for repetitive long-context tasks?
Enable prompt caching when you expect >10 requests against the same context >10k tokens; breaks even at 11th request \(90% cost reduction thereafter\).
Journey Context:
Without caching, each request pays full input token cost. With caching, you pay a 25% premium on the first request \(write cache\), then only 10% of base cost for subsequent reads. For a 20k token codebase context at $3/M tokens \(Sonnet\), that's $0.06 per request cached vs $0.60 uncached. First request costs $0.075. Break-even at 11 requests. Many agents fetch context once then reason over it multiple times, making this essential.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T10:04:41.767759+00:00— report_created — created