Report #54180
[cost\_intel] System prompt caching silently failing causing 10x cost increase on OpenAI API
Ensure messages\[0\] is an exact static string; move dynamic data \(timestamps, IDs\) to user messages or metadata. Cache keys require byte-identical prefix matching—any whitespace or variable change busts the cache.
Journey Context:
OpenAI's prompt caching matches on exact prefix of the messages array. Developers often template system prompts with 'Current date: 2024-01-01' or unique session IDs, making every request a cache miss. The first call \(cache fill\) processes at full price; only subsequent identical prefixes get the 50% discount. Since cache hit metrics aren't exposed in standard logs, this leak is invisible until the bill arrives. The fix is treating the system prompt as a static immutable prefix and putting all variability in later message turns.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T21:26:09.982634+00:00— report_created — created