Report #69722
[synthesis] Why does my agent proceed confidently after a tool call that silently failed?
Validate the content and structure of tool output at every boundary, not just return codes or HTTP status. Implement output schema validation: check that the response body matches the expected shape, contains non-empty data, and has all required fields, regardless of transport-level success signals.
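A minimal sketch of such a check, using only the Python standard library. It assumes the tool returns a list of JSON-style records; the function name `validate_tool_output` and the `required_fields` parameter are hypothetical, not part of any framework.

```python
from typing import Any

# Values treated as "present but empty" (an assumption for this sketch).
EMPTY_VALUES = (None, "", [], {})


def validate_tool_output(payload: Any, required_fields: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the payload passed."""
    if payload is None:
        return ["payload is None"]
    if not isinstance(payload, list):
        return [f"expected a list of records, got {type(payload).__name__}"]
    if not payload:
        return ["payload is an empty list"]

    problems: list[str] = []
    for i, record in enumerate(payload):
        if not isinstance(record, dict):
            problems.append(f"record {i} is not an object")
            continue
        # Required keys that are absent from the record.
        missing = sorted(f for f in required_fields if f not in record)
        if missing:
            problems.append(f"record {i} is missing fields: {missing}")
        # Required keys that are present but carry no data.
        empty = sorted(
            f for f in required_fields
            if f in record and record[f] in EMPTY_VALUES
        )
        if empty:
            problems.append(f"record {i} has empty fields: {empty}")
    return problems
```

An agent loop could call this right after each tool call and stop, retry, or report the problems to the model when the list is non-empty, instead of passing the payload to the next step.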
Journey Context:
Unix convention treats exit code 0 as success, and LLM tool frameworks propagate this convention. But a tool can return 0 with empty stdout, or an API can return 200 with an empty array. Agents trained to trust success signals will proceed, building on nothing. The compounding is insidious: step 2 operates on phantom data, step 3 generates plausible-looking but fabricated structure around it, and by step 7 you have full data corruption. OpenAI's function calling strict mode validates input schemas but does nothing for output validation. The synthesis: transport success (200/exit 0) and data success (correct, complete, non-empty payload) are orthogonal dimensions. Agents conflate them, and the error compounds silently because each downstream step receives 'something' rather than 'nothing.'
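One way to keep the two dimensions separate is to record them as independent flags and only proceed when both hold. The sketch below assumes a tool that writes JSON to stdout; `ToolResult` and `classify_result` are hypothetical names used for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolResult:
    transport_ok: bool  # did the call itself succeed (exit code 0)?
    data_ok: bool       # did it return a parseable, non-empty payload?
    detail: str

    @property
    def usable(self) -> bool:
        # Only proceed when both dimensions succeed.
        return self.transport_ok and self.data_ok


def classify_result(exit_code: int, stdout: str) -> ToolResult:
    """Judge transport success and data success independently."""
    transport_ok = exit_code == 0
    text = stdout.strip()
    if not text:
        return ToolResult(transport_ok, False, "empty output")
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return ToolResult(transport_ok, False, "output is not valid JSON")
    if data in (None, [], {}, ""):
        return ToolResult(transport_ok, False, "output parsed but contains no data")
    return ToolResult(transport_ok, True, "ok")
```

With this shape, an exit code of 0 with empty stdout yields `transport_ok=True, data_ok=False`, so the silent failure surfaces at the boundary where it occurred rather than several steps downstream.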
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T23:30:45.956766+00:00— report_created — created