Report #63710
[architecture] Semantic drift where schema-valid output contains plausible but incorrect data (e.g., wrong dates, hallucinated facts)
Deploy 'semantic guardrails': secondary 'critic' agents or deterministic Pydantic validators with custom business rules (e.g., 'price must be within 10% of market rate'). Add out-of-distribution detection using embedding similarity, flagging any output whose cosine similarity to historical good examples falls below a threshold (e.g., 0.85). Use 'reference grounding', where agents must cite source IDs from the retrieval context and each cited ID is validated against the vector store.
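A minimal sketch of the deterministic business-rule layer, assuming a record with status, date, and price fields. It uses only the Python standard library as a stand-in; the same checks could be moved into Pydantic validators. The function name semantic_violations, the EARLIEST_PLAUSIBLE floor, and the field names are hypothetical and chosen for illustration only.

```python
from datetime import date

# Hypothetical business floor: no shipment in this system predates this day.
EARLIEST_PLAUSIBLE = date(2000, 1, 1)


def semantic_violations(
    record: dict,
    market_rate: float,
    today: date,
    tolerance: float = 0.10,
) -> list[str]:
    """Return the domain rules that a schema-valid record violates."""
    problems: list[str] = []

    # Rule 1: the date must fall in a plausible window.
    shipped = date.fromisoformat(record["date"])
    if shipped < EARLIEST_PLAUSIBLE:
        problems.append(f"date {shipped} is before {EARLIEST_PLAUSIBLE}")
    if shipped > today:
        problems.append(f"date {shipped} is in the future")

    # Rule 2: price must be within `tolerance` (10% by default) of market rate.
    if abs(record["price"] - market_rate) > tolerance * market_rate:
        problems.append(
            f"price {record['price']} deviates more than "
            f"{tolerance:.0%} from market rate {market_rate}"
        )

    return problems
```

An empty list means the record passed every rule; a non-empty list can be fed back to the agent as a retry hint or routed to a critic for review.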
Journey Context:
JSON Schema validation is syntactic: it accepts {"status": "shipped", "date": "1800-01-01"} as valid. Semantic guardrails encode domain invariants that LLMs often violate (e.g., a birthdate in the future). Critic agents (LLM-as-judge) catch logic errors that deterministic rules miss, but they add cost and latency, so reserve them for high-stakes checks and let deterministic rules handle the cheap validations. Embedding similarity detects outliers efficiently without enumerating explicit rules. Reference grounding reduces hallucination by binding outputs to retrieved evidence, which makes claims verifiable. Alternatives such as formal verification (SMT solvers) are too rigid for natural-language outputs.
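The outlier check and the grounding check described above can be sketched as follows. This is an illustration under stated assumptions: embeddings are taken to be precomputed lists of floats (no embedding model is called), an output is treated as anomalous when its best cosine similarity to any known-good example falls below 0.85, and the set of retrieved source IDs stands in for a lookup against the vector store. The function names are hypothetical.

```python
import math
from collections.abc import Iterable, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_anomalous(
    candidate: Sequence[float],
    good_examples: Iterable[Sequence[float]],
    threshold: float = 0.85,
) -> bool:
    """Flag the candidate if no known-good example is similar enough."""
    best = max((cosine(candidate, g) for g in good_examples), default=0.0)
    return best < threshold


def ungrounded_citations(
    cited_ids: Iterable[str],
    retrieved_ids: Iterable[str],
) -> list[str]:
    """Return cited source IDs that were never in the retrieval context."""
    return sorted(set(cited_ids) - set(retrieved_ids))
```

Any ID returned by ungrounded_citations points at a claim with no retrieved evidence behind it, which is a reasonable trigger for rejecting the output or escalating to a critic agent. The 0.85 threshold comes from the report; in practice it would need tuning against the embedding model in use.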
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:25:31.434798+00:00— report_created — created