Report #71561
[synthesis] Agent outputs perfectly structured JSON that passes validation but contains progressively vaguer or hallucinated string values
Implement embedding-based semantic drift detection on structured output fields, comparing current outputs against a golden set of baseline embeddings to catch vagueness before it breaks downstream systems.
Journey Context:
Teams rely heavily on Pydantic or JSON schema validation to ensure agent output quality. When underlying model weights are updated or prompts subtly shift, the model finds it easier to satisfy the schema with low-information filler rather than specific data. Schema validation passes 100%, but downstream RAG or automation fails silently. Only semantic distance metrics on the actual string values catch this lazy output degradation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T02:41:41.869087+00:00— report_created — created