Report #79729
[cost_intel] Anthropic prompt caching breakpoint for agentic coding loops
Enable prompt caching when the repeated context exceeds ~20k tokens and the loop runs 10+ turns/hour; tokens read from the cache are then billed at a 90% discount.
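As a rough illustration of what enabling caching involves, the sketch below builds a Messages API request body with a cache breakpoint on the last stable block (the file context), so the prefix up to that point can be reused across turns. The helper name `build_cached_request` and the `MODEL_NAME` placeholder are hypothetical and not from this report; the `cache_control` field with type `ephemeral` follows Anthropic's documented request shape, but verify it against the current API reference before relying on it.

```python
from typing import Any

# Hypothetical placeholder: substitute a real model identifier.
MODEL_NAME = "model-name-placeholder"


def build_cached_request(
    system_prompt: str,
    file_context: str,
    messages: list[dict[str, Any]],
    max_tokens: int = 1024,
) -> dict[str, Any]:
    """Build a request body whose stable prefix is marked as cacheable.

    The cache_control marker goes on the last block that stays identical
    between turns; everything before it is part of the cached prefix.
    """
    return {
        "model": MODEL_NAME,
        "max_tokens": max_tokens,
        "system": [
            # Stable instructions: identical on every turn.
            {"type": "text", "text": system_prompt},
            # Stable file context: the breakpoint sits on this block.
            {
                "type": "text",
                "text": file_context,
                "cache_control": {"type": "ephemeral"},
            },
        ],
        # Per-turn conversation: changes each turn, so it is not marked.
        "messages": messages,
    }
```

With the official Python SDK installed, a body like this could presumably be sent with `client.messages.create(**request)`. The response's usage fields should then show whether tokens were written to or read from the cache.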
Journey Context:
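Agentic coding loops resend the same system prompt and file context on every turn. Without caching, that repeated prefix is billed at the full input rate each time, so cost grows with every turn of the conversation. The break-even point is surprisingly low: at ~20k tokens of repeated context and 10+ turns/hour, the 90% discount on cache reads outweighs the implementation overhead. Turn frequency matters because cached prefixes expire after a short idle window (5 minutes by default, refreshed on each hit), so widely spaced turns pay the cache-write cost again. Below this threshold, the engineering cost of adding cache_control markers to requests may exceed the savings.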
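The arithmetic behind the break-even claim can be sketched as follows. This is a minimal model, not the report author's calculation. It assumes every turn after the first hits the cache, a cache-write multiplier of 1.25x and a cache-read multiplier of 0.10x the base input price, and an illustrative base price of $3 per million input tokens. All three numbers are assumptions to check against current pricing. It also ignores output tokens, the uncached per-turn messages, and engineering time.

```python
def uncached_cost(context_tokens: int, turns: int, price_per_mtok: float) -> float:
    """Cost in dollars of resending the full context at the base rate each turn."""
    return turns * context_tokens * price_per_mtok / 1_000_000


def cached_cost(
    context_tokens: int,
    turns: int,
    price_per_mtok: float,
    write_mult: float = 1.25,  # assumed cache-write premium
    read_mult: float = 0.10,   # assumed cache-read rate (90% discount)
) -> float:
    """Cost in dollars with one cache write followed by cache reads.

    Assumes every turn after the first arrives before the cache expires.
    """
    if turns <= 0:
        return 0.0
    per_turn = context_tokens * price_per_mtok / 1_000_000
    return per_turn * (write_mult + (turns - 1) * read_mult)


def break_even_turns(write_mult: float = 1.25, read_mult: float = 0.10) -> int:
    """Smallest number of turns at which caching is strictly cheaper."""
    if read_mult >= 1.0:
        raise ValueError("cache reads must be cheaper than the base rate")
    turns = 1
    # Uncached cost is `turns` units; cached is write + (turns - 1) * read.
    while turns <= write_mult + (turns - 1) * read_mult:
        turns += 1
    return turns


if __name__ == "__main__":
    tokens, turns, price = 20_000, 10, 3.0  # illustrative inputs
    base = uncached_cost(tokens, turns, price)
    cached = cached_cost(tokens, turns, price)
    print(f"uncached: ${base:.3f}  cached: ${cached:.3f}  "
          f"saved: {100 * (1 - cached / base):.1f}%")
    print("break-even turns:", break_even_turns())
```

Under these assumptions the per-token break-even arrives by the second turn, and the total saving on the repeated context stays below 90% because the first turn pays the write premium. So the 20k-token, 10 turns/hour threshold in this report is best read as the point where absolute savings justify the implementation effort. The pricing math alone favors caching much earlier.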
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:25:35.516894+00:00— report_created — created