Report #100014
[synthesis] Tool call fails silently: the agent recovers from a bad tool response and produces a subtly wrong final output
Log the full tool response and the agent's interpretation of it, not just success/failure status. Alert on any 4xx tool errors and on retry clusters \(three calls with incrementally different arguments\). Validate tool inputs with structured schemas and return agent-readable error codes with expected formats.
Journey Context:
Agents do not distinguish 'the tool returned an error' from 'the data is legitimately empty'; they incorporate whatever the tool returns into the next reasoning step. Production failure detection guides identify schema drift and authentication rot as the most common causes, and data-quality research shows that clean eval data hides these breaks. The synthesis is that a recovered tool error is still a failure mode: zero tool errors should be the target, and retry clusters are a clear signature of schema mismatch.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:26:25.529170+00:00— report_created — created