Report #87626
[synthesis] Updated prompt templates do not take effect in production agent runs
Include a cache-busting version identifier in the prompt template prefix, and change it with every deployment. Monitor cache hit rates alongside prompt template deployment events. After deploying a prompt change, verify within the cache TTL window that the shift in output distribution matches expectations. Treat prompt cache invalidation with the same rigor as CDN cache invalidation.
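A minimal sketch of the version identifier idea, assuming the template is available as a plain string at deploy time. The helper names (`template_version`, `build_system_prompt`), the tag format, and the deploy IDs are hypothetical and not part of any provider SDK. The tag combines a deploy ID with a content hash, so the first line of the prompt changes when either the deployment or the template text changes.

```python
import hashlib


def template_version(template: str, length: int = 12) -> str:
    """Return a short content hash; it changes whenever the template text changes."""
    digest = hashlib.sha256(template.encode("utf-8")).hexdigest()
    return digest[:length]


def build_system_prompt(template: str, deploy_id: str) -> str:
    """Prepend a version line so a new deployment starts a new prompt prefix.

    The tag format is an assumption made for illustration only.
    """
    tag = f"[prompt-version: {deploy_id}-{template_version(template)}]"
    return f"{tag}\n{template}"


# Hypothetical templates and deploy IDs.
old = build_system_prompt("You are a support agent. Be brief.", "2026-06-01")
new = build_system_prompt("You are a support agent. Be thorough.", "2026-06-22")

# The first line differs, so the two prompts share no leading text.
print(old.splitlines()[0])
print(new.splitlines()[0])
```

Placing the tag on the first line is a deliberate choice: prefix-matched caches compare from the start of the prompt, so a difference at the very beginning leaves nothing from the previous version to reuse. Logging the same tag with each agent run also makes it possible to confirm which template version a given output came from.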
Journey Context:
Prompt caching (OpenAI, Anthropic) reduces latency and cost by reusing computed KV caches for shared prompt prefixes. It also creates a stale-cache risk: when you update a prompt template, a cache may continue to serve the old version until it expires or is evicted. The agent appears healthy, with no errors and normal latency, but follows outdated instructions. This is the same failure mode as a stale CDN cache in web infrastructure, yet most teams do not apply CDN cache invalidation practices to LLM prompts. The degradation is silent because outputs produced under the old instructions are still valid. OpenAI's prompt caching matches by exact prefix: a change in the prefix invalidates the cache from that point onward, while a change confined to a dynamic suffix or a later mid-prompt section leaves the earlier cached prefix in use. The critical insight is that prompt deployment must be treated as a cache-aware operation, not a simple file update.
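The prefix-matching behavior described above can be illustrated with a small sketch. It compares characters as a stand-in for tokens, which is an assumption: real providers match on tokens, and their exact matching granularity is not modeled here. The function name and the sample prompts are hypothetical.

```python
def shared_prefix_len(old: str, new: str) -> int:
    """Count the leading characters that two prompts have in common."""
    count = 0
    for a, b in zip(old, new):
        if a != b:
            break
        count += 1
    return count


# Hypothetical prompts: a static instruction block followed by a dynamic suffix.
static = "You are a support agent.\nAnswer briefly.\n"
old_prompt = static + "User: hi"

# Changing only the dynamic suffix leaves the whole static block shared.
suffix_changed = static + "User: hello"
print(shared_prefix_len(old_prompt, suffix_changed) >= len(static))

# Changing the very start of the prompt leaves nothing shared.
versioned = "[prompt-version: v2]\n" + old_prompt
print(shared_prefix_len(old_prompt, versioned))
```

Running a comparison like this between the previously deployed prompt and the new one gives a rough view of how much of the old prefix is still eligible for reuse, and therefore whether a deployment changed the part of the prompt the cache is keyed on.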
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T05:40:01.099055+00:00— report_created — created