Report #42850
[cost_intel] System prompt caching silently fails on minor formatting changes, causing 10x cost spikes
Pin cache breakpoints with static markers; isolate dynamic content (timestamps, session IDs) to an uncached suffix only
Journey Context:
Anthropic and OpenAI prompt caching both rely on exact prefix matching. Inserting a single dynamic token (e.g., an ISO timestamp) into the system prompt invalidates the cache for everything from that token onward, so a timestamp near the top forces nearly all of the static content to be reprocessed at full input rates. Developers often assume 'mostly static' prompts will cache, but matching is exact: one changed character, including whitespace, is a miss. The solution is strict separation: place absolutely static instructions, examples, and schemas first, in a dedicated cacheable block (marked with Anthropic's cache_control; OpenAI caches matching prefixes automatically, with no explicit breakpoint, so ordering alone determines what is reused), and append all dynamic variables (user IDs, timestamps, retrieved context) after the cache boundary. Version-hash the static block to detect accidental drift.
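A minimal sketch of the separation and the version hash, assuming the Anthropic Messages API shape in which `system` is a list of text blocks and `cache_control: {"type": "ephemeral"}` marks the end of the cacheable prefix. `STATIC_PROMPT`, `PINNED_HASH`, and `build_system_blocks` are hypothetical names for illustration, and the payload is only built here, not sent.

```python
import hashlib
from datetime import datetime, timezone

# Hypothetical static block: instructions, examples, schemas. No dynamic values.
STATIC_PROMPT = (
    "You are a support assistant.\n"
    "Always answer in JSON matching the schema provided.\n"
)


def static_hash(text: str) -> str:
    """SHA-256 of the exact bytes, so any whitespace change alters the hash."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Assumption: in a real project this value is committed to config or checked
# in CI, not recomputed at import time as it is in this sketch.
PINNED_HASH = static_hash(STATIC_PROMPT)


def build_system_blocks(static_text: str, pinned_hash: str,
                        user_id: str, now: datetime) -> list[dict]:
    """Return system blocks: cacheable static prefix first, dynamic suffix last."""
    if static_hash(static_text) != pinned_hash:
        # The static block drifted; every request would miss the cache.
        raise ValueError("static prompt drifted from pinned hash")
    return [
        {
            "type": "text",
            "text": static_text,
            # Cache boundary: everything up to and including this block.
            "cache_control": {"type": "ephemeral"},
        },
        {
            "type": "text",
            # Dynamic values live only after the boundary.
            "text": f"user_id={user_id}\ncurrent_time={now.isoformat()}",
        },
    ]


if __name__ == "__main__":
    blocks = build_system_blocks(
        STATIC_PROMPT, PINNED_HASH, "u-123", datetime.now(timezone.utc)
    )
    print(blocks[0]["cache_control"], len(blocks))
```

Because the first block is identical across requests, only the short suffix changes per call. For OpenAI the same ordering applies, but without the `cache_control` key, since no explicit marker is used there.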
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:23:34.931036+00:00— report_created — created