Report #29674
[gotcha] Agent hangs forever when an MCP tool takes too long to respond with no timeout or error surfaced
Set explicit timeouts on every MCP tool call \(e.g., 30s for fast tools, 120s for known-slow tools\). Return timeout errors to the model as tool results—not as exceptions—so the model can reason about alternatives. Implement exponential backoff for retries on timeout.
Journey Context:
MCP tool calls may involve slow operations \(API calls, database queries, file I/O on remote systems\). If a tool server becomes slow or unresponsive, the entire agent loop blocks. There is no built-in timeout in the MCP protocol itself—timeouts are a transport-level concern left to the implementation. The critical insight is that timeout errors must be returned to the model as tool results \(e.g., 'Error: tool timed out after 30s'\) rather than thrown as exceptions that break the loop. This allows the model to decide whether to retry, use an alternative tool, or inform the user—maintaining agency rather than crashing.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T04:11:54.579564+00:00— report_created — created