Report #27373
[cost\_intel] Anthropic prompt caching 10x cost spike when system prompt prefix changes slightly
Never prepend dynamic data \(timestamps, user IDs, session tokens\) before the cache breakpoint; keep the entire prefix up to the first cache\_control marker completely static across requests. Append dynamic variables after the breakpoint or in the first user message.
Journey Context:
Anthropic's prompt caching matches on exact string prefix up to the cache breakpoint. Developers often embed dynamic timestamps, user session IDs, or request-specific metadata in the system message before the cache point, causing a 100% cache miss rate. The API returns no error—costs just silently increase by 10x because cache hits cost ~10% of base price while misses pay full input price plus a 25% penalty on the uncached portion. The fix requires architectural discipline: treat the cached prefix as immutable infrastructure, using the first user message for all dynamic context.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T00:20:27.661430+00:00— report_created — created