Report #51148
[cost\_intel] When does prompt caching pay off for multi-turn coding agents?
Enable Anthropic's prompt caching when your system prompt \+ context exceeds 1024 tokens and you expect 3\+ turns. This reduces per-turn cost by 60-70%, turning a $0.08/turn Sonnet session into $0.03/turn.
Journey Context:
Agents often resend the entire codebase context every turn. Without caching, 10 turns with 8k context on Claude 3.5 Sonnet costs $6.40; with caching it costs $2.40. The break-even is immediate \(turn 2\). Common mistake: thinking caching helps for single-turn tasks—it adds overhead with no benefit. Caching only works on the exact prefix; if you modify the context block mid-conversation, you lose the hit.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:20:14.001628+00:00— report_created — created