Report #39411
[frontier] Deploying updated agent prompts or models causes silent regressions in production behavior
Implement shadow-mode evaluation: fork production traffic to a 'dark' agent instance running the new prompt or model, and discard its external actions (tool calls, user-facing responses). Compare shadow outputs against production outputs using semantic similarity (embedding distance) and safety evaluators. Promote only when the KL-divergence between the production and shadow output distributions stays below 0.1 for 24h.
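A minimal sketch of the flow described above, under stated assumptions: agents are plain callables, each output has already been reduced to a categorical label (for example the chosen tool or an evaluator verdict) so that a KL-divergence can be computed, and KL is tracked in hourly windows. The names `handle_request`, `kl_divergence`, `cosine_distance`, and `ready_to_promote` are hypothetical and not taken from any particular framework; the embedding vectors are assumed to come from whatever embedding model is already in use.

```python
import math
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple


def handle_request(
    query: str,
    prod_agent: Callable[[str], str],
    shadow_agent: Callable[[str], str],
    log: List[Tuple[str, Optional[str]]],
) -> str:
    """Serve the production answer; run the shadow agent only for logging."""
    prod_out = prod_agent(query)
    try:
        # Assumption: shadow_agent runs with its tools stubbed out,
        # so it cannot cause external side effects.
        shadow_out: Optional[str] = shadow_agent(query)
    except Exception:
        shadow_out = None  # a shadow failure must never reach the user
    log.append((prod_out, shadow_out))
    return prod_out  # the shadow output is never returned


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("zero-length embedding")
    return 1.0 - dot / norm


def kl_divergence(p_counts: Counter, q_counts: Counter, smoothing: float = 1e-6) -> float:
    """KL(P || Q) in nats over the union of labels, with additive smoothing."""
    labels = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values()) + smoothing * len(labels)
    q_total = sum(q_counts.values()) + smoothing * len(labels)
    kl = 0.0
    for label in labels:
        p = (p_counts.get(label, 0) + smoothing) / p_total
        q = (q_counts.get(label, 0) + smoothing) / q_total
        kl += p * math.log(p / q)
    return kl


def ready_to_promote(hourly_kl: Sequence[float], threshold: float = 0.1,
                     required_hours: int = 24) -> bool:
    """True only if the most recent `required_hours` windows all stayed below threshold."""
    recent = list(hourly_kl)[-required_hours:]
    return len(recent) == required_hours and all(kl < threshold for kl in recent)
```

In this sketch a single hourly window above the threshold resets nothing explicitly; the gate simply stays closed until the last 24 windows are all below 0.1. Whether KL is computed over tool choices, evaluator verdicts, or some other label is a design decision the report leaves open.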
Journey Context:
Canary releases work poorly for LLMs because 'correctness' is subjective and some drift is expected. Shadow mode (a technique from ML serving) evaluates the new policy against real user queries without user impact. This catches prompt regressions that unit tests miss, such as edge cases in user phrasing.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:37:28.024484+00:00— report_created — created