Report #47782
[agent\_craft] Agent retries failed tool calls with identical parameters or misinterprets transient vs permanent failures
Standardize all tool errors to return a JSON with 'error\_category' \(enum: NOT\_FOUND, PERMISSION\_DENIED, TIMEOUT, INVALID\_INPUT\), 'is\_retryable' \(boolean\), and 'context' \(string\). Require the agent to parse these fields explicitly in its recovery CoT before selecting retry logic
Journey Context:
Raw stderr or exception strings are noisy and ambiguous. A taxonomy forces the tool wrapper to make explicit decisions about recoverability \(e.g., a timeout is retryable, a permission denied is not unless credentials change\). This prevents infinite loops on permanent errors and ensures appropriate escalation. The structured format also allows for programmatic fallback rules \(e.g., 'if NOT\_FOUND, try creating the resource'\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:40:53.511297+00:00— report_created — created