Report #38586
[cost\_intel] Enabling prompt caching on all multi-turn Claude conversations without break-even analysis
Enable Anthropic prompt caching only when the static prefix \(system prompt \+ context\) exceeds 8,000 tokens AND the conversation will exceed 3 turns; otherwise the 1.25x cache write premium may not be recovered
Journey Context:
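Anthropic charges 1.25x for cache writes \($0.00375 vs $0.003 per 1k tokens for Sonnet\) but only 0.25x for cache reads \($0.00075\). For a cached prefix of T tokens at standard price p per token, with n cache reads after the initial write, caching is cheaper when \(1.25 × T × p\) \+ \(n × 0.25 × T × p\) < \(n \+ 1\) × T × p, which simplifies to n > 0.25 / 0.75 ≈ 0.33. For an 8k context at Sonnet pricing: the cache write costs $0.03, which is $0.006 more than the standard $0.024. Each read costs $0.006 and saves $0.018. Break-even is therefore at 0.33 reads, so the premium is recovered on the first cache hit \(turn 2\), provided that turn arrives before the cache entry expires. Because the premium and the saving both scale linearly with prefix size, the break-even turn count does not depend on context length; for contexts <4k tokens the saving is simply small \(under $0.009 per read\), so caching buys little.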
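
The break-even arithmetic can be checked with a short Python sketch. It is an illustration under stated assumptions, not an official cost calculator: the function names are hypothetical, the price and multipliers are the figures quoted in this report, only the static prefix is priced \(output tokens and per-turn user messages are ignored\), and every turn after the first is assumed to hit the cache before it expires.

```python
def prefix_cost(prefix_tokens, turns, cached,
                price_per_1k=0.003,   # standard input price quoted in this report
                write_mult=1.25,      # cache write multiplier quoted in this report
                read_mult=0.25):      # cache read multiplier quoted in this report
    """Total input cost of the static prefix over `turns` requests.

    Assumption: turn 1 writes the cache and every later turn is a cache hit.
    """
    if turns <= 0:
        return 0.0
    base = prefix_tokens / 1000 * price_per_1k  # one uncached pass over the prefix
    if not cached:
        return turns * base
    return base * write_mult + (turns - 1) * base * read_mult


def break_even_reads(write_mult=1.25, read_mult=0.25):
    """Cache reads needed to recover the write premium.

    Premium per write is (write_mult - 1) * base and saving per read is
    (1 - read_mult) * base, so the prefix size cancels out.
    """
    return (write_mult - 1) / (1 - read_mult)


def caching_is_cheaper(prefix_tokens, turns, **pricing):
    """True when the cached total is strictly below the uncached total."""
    return (prefix_cost(prefix_tokens, turns, True, **pricing)
            < prefix_cost(prefix_tokens, turns, False, **pricing))
```

With these defaults, `break_even_reads()` is about 0.33, and `caching_is_cheaper(8000, 2)` holds while `caching_is_cheaper(8000, 1)` does not, since a single-turn conversation pays the write premium and never reads the cache. Passing different multipliers as keyword arguments lets the same check be rerun against current pricing.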
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:14:20.817789+00:00— report_created — created