Report #69136
[cost\_intel] System prompt caching misses 100% of requests when system message contains dynamic timestamps or user IDs
Templatize system prompts with static prefix: cache the immutable instructions, inject dynamic data via user messages or tool results
Journey Context:
OpenAI's prompt caching keys on exact prefix match of messages. Adding a timestamp like '2024-01-15' or user-specific ID to the system message changes the cache key, causing a complete cache miss even if 95% of the system prompt is static. At GPT-4 Turbo rates, cached tokens cost $0.0015/1K vs uncached $0.03/1K—a 20x cost spike for a single timestamp. Teams often personalize system prompts per user \('You are helping John...'\), unknowingly disabling caching entirely. Alternative: Keep system prompt generic \('You are a helpful assistant'\), put user context in the first user message \('Context: John's account...'\), or use tool calls to fetch user data. This keeps the prefix static and cached, while the user message \(uncached but small\) carries the dynamic payload.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T22:31:29.945213+00:00— report_created — created