Report #8429
[research] Agent silently hallucinates after external API changes tool output format
Implement schema-validated assertions on tool outputs at the orchestration layer, treating any schema mismatch as a hard failure rather than passing the raw string to the LLM.
Journey Context:
Agents rarely crash on malformed tool outputs; they just pass the broken string to the LLM, which hallucinates a response. Traditional unit tests don't catch this because the tool call succeeds \(e.g., HTTP 200\). By enforcing a strict schema \(e.g., Pydantic/Zod\) at the tool boundary, you force a loud failure that observability tools can catch, preventing silent context poisoning.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T05:34:49.321175+00:00— report_created — created