Report #60050
[synthesis] Agent behavior changes subtly after minor prompt updates despite no change in core logic
Monitor prompt cache hit rates and where each cached prefix ends. Keep static content in a stable prefix, place dynamic context after it, and track the exact token boundary of every cache hit, so that a prompt update that moves the boundary shows up in monitoring instead of surfacing later as unexplained behavior drift.
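A minimal sketch of that layout and the two numbers worth logging, assuming your provider reports a cached-token count per request. The names STATIC_PREFIX, build_prompt, prefix_fingerprint and cache_hit_rate are hypothetical and not part of any SDK.

```python
import hashlib

# Hypothetical static instructions; in practice this is the versioned system prompt.
STATIC_PREFIX = "You are a support agent. Follow the policy below.\n"


def build_prompt(static_prefix: str, dynamic_context: str) -> str:
    """Put static content first so the cacheable prefix is identical across requests."""
    return static_prefix + dynamic_context


def prefix_fingerprint(static_prefix: str) -> str:
    """Short hash of the static prefix; log it so an accidental prefix change is visible."""
    return hashlib.sha256(static_prefix.encode("utf-8")).hexdigest()[:12]


def cache_hit_rate(cached_tokens: int, prompt_tokens: int) -> float:
    """Fraction of prompt tokens served from cache (assumes the provider reports both counts)."""
    return cached_tokens / prompt_tokens if prompt_tokens else 0.0
```

Logging the fingerprint next to the hit rate makes it possible to tell a deliberate prefix change from an accidental one when the hit rate drops.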
Journey Context:
Teams version their prompts and run evals, but production behavior still drifts. The missing link is often prompt caching. A minor change to dynamic context (such as injecting a user ID at the top) alters the prompt from its first tokens, so the cached KV prefix no longer matches and every later token sits at a different position. The prompt is logically the same, but the model computes it from a different internal state, which can produce slightly different attention weights and output distributions. It's not a logic change; it's a computational state change.
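The effect of placement on the reusable prefix can be illustrated with a small sketch. It uses a whitespace split as a stand-in tokenizer, which is an assumption; a real check would use the model's own tokenizer, and the prompt strings are made up for illustration.

```python
def tokenize(text: str) -> list[str]:
    # Stand-in tokenizer (assumption): whitespace split, not the model's real tokenizer.
    return text.split()


def shared_prefix_len(a: list[str], b: list[str]) -> int:
    """Number of leading tokens two prompts have in common (the reusable prefix)."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


static = "You are a support agent. Answer using the policy."
old = tokenize(static + " user_id=41")        # previously cached request
id_on_top = tokenize("user_id=42 " + static)  # dynamic value injected first
id_at_end = tokenize(static + " user_id=42")  # dynamic value appended last

print(shared_prefix_len(old, id_on_top))  # 0: nothing is reusable
print(shared_prefix_len(old, id_at_end))  # 9: the whole static part is reusable
```

Running the same comparison between the old and new prompt versions in CI gives an early signal that a "minor" update moved the cache boundary.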
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T07:16:49.684460+00:00— report_created — created