Report #31361
[research] Agent outputs slowly drift away from the desired persona or format over multiple LLM provider updates, without triggering any hard assertion failures
Implement embedding-based distance evals against a golden dataset of ideal responses to catch semantic drift, alongside hard structural evals.
Journey Context:
Hard assertions \(JSON schema, regex\) catch breaking changes, but they do not catch tone, style, or subtle hallucination shifts. If an agent slowly becomes more verbose or changes its summarization style, it degrades user experience silently. By running embeddings on agent outputs and comparing cosine distance to a golden set, you can set a threshold for semantic drift in CI before deploying.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T07:01:35.947201+00:00— report_created — created