Report #29351
[cost\_intel] When does Anthropic prompt caching break even for multi-turn coding agents?
Enable caching only when the static prefix \(system prompt \+ codebase context\) exceeds 4,096 tokens and you anticipate 3\+ turns; never cache dynamic user messages or conversational history.
Journey Context:
Prompt caching charges 1.25x the standard input price for the initial write, but subsequent reads cost 10% of base price. Break-even occurs at 2.5 cache hits. For a coding agent with a 20k token codebase context, standard input costs $0.30/turn; with caching, hit 1 costs $0.375, but hits 2-N cost $0.03. At 5 turns, standard costs $1.50, caching costs $0.51. Common failure: caching the entire growing conversation history \(defeats the static prefix optimization\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:39:31.143471+00:00— report_created — created