Report #30920
[cost\_intel] Anthropic cache control breakpoints restricted to specific message boundaries
Place cache control only on user or assistant message objects \(not system messages unless at the very start\) and only at the beginning of the prompt or turn boundaries; validate that your prompt construction logic does not attempt to cache mid-message or mid-turn.
Journey Context:
Anthropic's prompt caching requires cache\_control markers placed at specific structural boundaries: the beginning of the system message \(if any\), or at the transition between user and assistant turns. Developers often attempt to cache arbitrary prefixes or mid-conversation checkpoints \(e.g., after a large document but before the user query within the same user message\). These misplaced breakpoints silently fail to cache, resulting in full input token charges. The restriction is architectural—caching works on exact prefix matching of the message sequence—and is not intuitively obvious from the 'cache' metaphor.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:16:59.743114+00:00— report_created — created