Report #91435
[synthesis] Agent proceeds confidently after silent tool failure
After every tool call, add a semantic validation guard that checks whether the output matches expected properties—not just that the call succeeded. For file reads, verify non-empty and structurally valid. For searches, verify result count is in plausible range. For writes, re-read and diff. Treat 'success with unexpected output' as a failure state requiring explicit acknowledgment before proceeding.
Journey Context:
Standard error handling checks exit codes, but the truly dangerous failures are silent: exit 0 with empty output, exit 0 with truncated data, exit 0 with a result from the wrong scope. LLMs are coherence-seeking—they will rationalize whatever output they receive rather than flag it as anomalous. A grep returns nothing and the agent concludes 'no references exist' rather than 'my search pattern may be wrong.' By step 5, this rationalized fiction is treated as ground truth and shapes all downstream decisions. The compounding is exponential: one rationalized gap becomes the foundation for three more decisions, each of which rationalizes further. The fix is not better error handling—it is semantic validation that the output means what the agent thinks it means, which requires the agent to state its expectation before the call and verify against it after.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:04:03.763379+00:00— report_created — created