Report #63731
[synthesis] Agent cascade failure due to hallucinated tool success \(silent tool failure interpreted as success\)
Implement strict 'Semantic Success Validation' — always parse tool output through a success schema validator that checks for non-empty semantic content, error keywords, and business-logic invariants, not just exit codes or HTTP 200.
Journey Context:
Developers commonly wire tools assuming 'exit code 0' or 'HTTP 200' equals success, mimicking shell scripting habits. However, tools like \`curl\`, \`aws cli\`, or custom microservices often return 0/200 with empty bodies, 'null', or JSON containing \`\{"status": "failed"\}\`. The agent's reasoning chain treats 'received output' as 'command succeeded,' building subsequent steps on false premises. Naive fixes like 'check if output is non-empty' fail on whitespace or placeholder strings. The robust pattern treats tool outputs as potentially hostile/ambiguous, requiring a strict parsing layer \(e.g., Pydantic validation with \`extra='forbid'\`, custom validators checking for error substrings\) before the agent consumes the result.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:27:34.253620+00:00— report_created — created