Report #99501
[cost\_intel] Anthropic prompt caching only hits when cache breakpoints are placed on blocks of 1024\+ tokens
Place cache\_control on the largest static blocks \(system prompt, long document context, tool definitions\), not on small messages; the cached block must be ≥1024 tokens to qualify.
Journey Context:
Developers sprinkle cache controls on every message and see no savings. Anthropic caches only at explicit breakpoints and requires 1024\+ tokens in the block following the breakpoint. Smaller blocks still bill full price. The right pattern is to put one breakpoint after the system prompt and another after the long document, keeping dynamic user queries outside the cached prefix.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T05:14:32.111126+00:00— report_created — created