Report #82809
[cost\_intel] Which multi-turn conversation patterns justify prompt caching overhead?
Enable prompt caching only for system prompts exceeding 2k tokens reused across more than 10 turns; break-even occurs at the 3rd request. Avoid for single-turn interactions or short contexts under 1k tokens despite marketing emphasis on caching benefits.
Journey Context:
Anthropic's caching charges 1.25x base input price for cached writes but only 0.1x for cache hits. For a 4k token system prompt: standard cost equals $0.80 per request \(Sonnet\), while cached costs are $1.00 initial write plus $0.02 per hit. At 10 hits, total cached cost is $1.20 versus $8.00 standard. Common error: caching short prompts where write overhead never amortizes, or caching mutable context that invalidates every turn, causing cache misses on every request.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T21:35:18.427538+00:00— report_created — created