Report #68185
[research] Agent fails catastrophically when an external API returns an unexpected error or timeout
Inject faults (e.g., API 500s, timeouts, malformed JSON) into the agent's tool execution environment during evals to measure its recovery rate and retry logic.
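A minimal sketch of what such an injector could look like, assuming tools are plain Python callables that return JSON strings. The names `with_faults`, `InjectedHTTPError`, and `get_weather` are hypothetical and chosen for illustration only; a real harness would hook into whatever tool-calling layer the agent framework exposes.

```python
import json
import random
from typing import Any, Callable


class InjectedHTTPError(Exception):
    """Hypothetical stand-in for an API 500 response."""


def with_faults(tool: Callable[..., str], fault_rate: float,
                rng: random.Random) -> Callable[..., str]:
    """Wrap a tool so that roughly `fault_rate` of its calls fail."""

    def wrapped(*args: Any, **kwargs: Any) -> str:
        if rng.random() >= fault_rate:
            return tool(*args, **kwargs)  # happy path: real tool output
        # Pick one of the three fault types named in the report.
        fault = rng.choice(["http_500", "timeout", "malformed_json"])
        if fault == "http_500":
            raise InjectedHTTPError("500 Internal Server Error (injected)")
        if fault == "timeout":
            raise TimeoutError("tool call timed out (injected)")
        return '{"result": '  # truncated payload; json.loads will reject it

    return wrapped


def get_weather(city: str) -> str:
    """Hypothetical tool used only to demonstrate the wrapper."""
    return json.dumps({"city": city, "temp_c": 21})
```

In an eval run, the agent would be handed `with_faults(get_weather, 0.3, random.Random(seed))` in place of the real tool. Passing a seeded `random.Random` keeps the fault schedule reproducible across runs, which makes regressions in retry behaviour easier to compare.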
Journey Context:
Most evals assume a happy path where all tools work perfectly. In production, APIs fail. An agent's true robustness is measured by how it handles tool errors: does it retry, ask the user, or hallucinate a success? Fault injection (chaos engineering for agents) is the only way to build confidence in the agent's error handling before deploying to production.
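One way to turn the behaviours listed above (retry, ask the user, hallucinate a success) into a number is to label each eval episode and compute a recovery rate over the episodes that actually saw a fault. The sketch below assumes the harness already records the four fields on `Episode`; the field names, the `Outcome` labels, and the definition of "recovery rate" are illustrative assumptions, not an established metric.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Outcome(Enum):
    RECOVERED = "recovered"                # a real tool call eventually succeeded
    ESCALATED = "escalated"                # agent asked the user or reported the failure
    HALLUCINATED = "hallucinated_success"  # agent claimed success with no successful call
    FAILED = "failed"                      # crashed or gave up silently


@dataclass
class Episode:
    """Hypothetical per-episode record produced by the eval harness."""
    faults_injected: int
    tool_succeeded: bool
    agent_claimed_success: bool
    agent_asked_user: bool


def classify(ep: Episode) -> Outcome:
    if ep.agent_claimed_success:
        # A success claim only counts if a tool call really succeeded.
        return Outcome.RECOVERED if ep.tool_succeeded else Outcome.HALLUCINATED
    if ep.agent_asked_user:
        return Outcome.ESCALATED
    return Outcome.FAILED


def summarize(episodes: List[Episode]) -> Tuple[Counter, Optional[float]]:
    """Count outcomes and compute recovery rate over faulted episodes only."""
    faulted = [ep for ep in episodes if ep.faults_injected > 0]
    counts = Counter(classify(ep) for ep in faulted)
    if not faulted:
        return counts, None  # no faults were injected, so the rate is undefined
    return counts, counts[Outcome.RECOVERED] / len(faulted)
```

Episodes with no injected fault are excluded from the denominator so that easy happy-path runs do not inflate the rate. Tracking `HALLUCINATED` separately from `FAILED` is a deliberate choice here: a false success claim is arguably worse than an honest failure and is worth surfacing on its own.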
Lifecycle
2026-06-20T20:56:03.405075+00:00— report_created — created