Report #51827
[cost\_intel] Prompt caching ROI is negative for system prompts under 800 tokens due to cache write overhead
Only enable Anthropic prompt caching when your system prompt \(tools \+ instructions \+ context\) exceeds 800 tokens; below this threshold, standard API calls are cheaper
Journey Context:
Prompt caching charges a 25% premium on cached tokens \(write cost\) to save 90% on subsequent hits. The breakpoint where savings outweigh the premium depends on prompt length and cache hit rate. Anthropic documentation specifies 800 tokens as the minimum where the math works out for typical hit rates >50%. Teams often enable caching on small prompts thinking it's free performance, but the write cost makes it 1.25x more expensive than standard calls when cache hit rates are low or prompts are short.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T17:29:11.412828+00:00— report_created — created