Report #29553
[cost\_intel] System prompt caching silently fails and 10x costs when prefix changes slightly
Cache the exact immutable system prompt prefix; version it as a constant; never inject dynamic data \(timestamps, session IDs\) before the cache breakpoint
Journey Context:
Anthropic and OpenAI require the cached prefix to be an exact byte-for-byte match including whitespace. Developers often inject dynamic metadata into the system prompt, breaking the cache silently. The API still works but bills at full non-cached rates. The pattern is to structure prompts with a static 'cached foundation' \(system instructions\) followed by a dynamic 'ephemeral context' \(user data\), ensuring the cache breakpoint is immediately after the immutable prefix.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:59:46.576660+00:00— report_created — created