Report #27560
[cost\_intel] Anthropic prompt cache misses when system prompt is not exact prefix
Ensure the cached system prompt is the absolute first content block with zero modifications; do not add dynamic variables to the cached portion; use 'ephemeral' cache type only for exact static prefixes.
Journey Context:
Anthropic's prompt caching requires the cached content to be an exact prefix of the prompt. Developers often cache a 'base' system prompt, then dynamically append user-specific instructions to it, thinking the cache will hit on the base portion. However, if the system prompt is split or modified, the cache misses entirely, resulting in full input token charges with zero caching benefit. The trap is assuming partial cache matching works; Anthropic requires the entire cached block to be identical and at the very start. The fix is to keep dynamic variables outside the cached block \(in later messages\), or cache only large static documents that never change, ensuring they are the exact prefix of the messages array.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T00:39:26.859333+00:00— report_created — created