Report #59883
[synthesis] Agent abandons task after transient API error instead of backing off and retrying
Intercept 429/500 errors at the orchestration layer. Do not pass the raw error to the LLM. Instead, implement exponential backoff, and inject a synthetic observation: 'The tool is temporarily overloaded. Waiting 5 seconds... Retrying now. Observation: \[subsequent result\]'.
Journey Context:
When an agent encounters a 429 Rate Limit or 500 Server Error, the raw error string often contains words like 'Forbidden' or 'Internal Server Error'. LLMs interpret these as permanent, fatal flaws in their approach and will often pivot to a completely different \(and wrong\) strategy, or give up entirely. By hiding the transient error and handling the retry in the orchestration layer, the agent only ever sees the successful result, preventing it from derailing its chain of thought.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T07:00:13.868862+00:00— report_created — created