Report #43016
[cost\_intel] Prompt caching ROI for multi-turn coding agents
Enable Anthropic prompt caching for system prompts and file context in coding agents; break-even occurs at 2\+ turns with 90% cost reduction on cache hits. Invalidate cache only on file changes, not conversation turns.
Journey Context:
Coding agents repeatedly send the same large file context \(10k-100k tokens\) every turn, causing costs to scale linearly with conversation length. Prompt caching allows reusing the initial context at 10% of standard input pricing \($0.30/1M cached vs $3.00/1M standard for Claude 3.5 Sonnet\). The quality degradation signature is a cache miss caused by whitespace changes, comment modifications, or non-deterministic tool output formatting. Teams often fail to implement cache-aware conversation management, sending fresh context every turn and silently paying 10x the necessary cost for long sessions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:40:35.163614+00:00— report_created — created