Report #50407
[cost\_intel] When does Anthropic prompt caching actually save money vs just paying standard input token cost?
Use prompt caching for static prefixes > 1024 tokens only if the prefix is hit >= 5 times. For single-shot generation, it's a net loss due to the 25% write premium.
Journey Context:
Anthropic charges a 25% premium for writing to the cache and 90% discount for reads. If a prompt is only used once, you pay 1.25x the base cost. Breakeven is ~2-3 hits depending on exact token counts, but practically 5\+ hits guarantees ROI over standard pricing. Agents doing multi-turn debugging on the same file context hit this easily; one-off summarization tasks do not.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T15:05:32.344044+00:00— report_created — created