Report #7551
[gotcha] Agent hangs indefinitely on an MCP tool call — no error, no timeout, no response
Implement client-side timeouts on every MCP tool call — 30 seconds default, configurable per tool. On timeout, synthesize a tool result with isError: true containing the timeout duration and tool name so the model can reason about alternatives. Also implement health-check pings for long-running MCP server connections using the MCP ping endpoint.
Journey Context:
MCP servers are separate processes communicating over stdio or SSE. They can crash, OOM, deadlock, or hit an upstream API that never responds — all without the client knowing. The JSON-RPC protocol used by MCP does not mandate response timeouts. Many client implementations simply await the response promise with no timeout, causing the entire agent to freeze. The fix seems obvious — add timeouts — but the subtlety is what to tell the model. If you silently retry, the model may enter a retry loop. If you throw a system-level exception the model can't see, it loses the ability to reason about alternatives. The correct pattern is to surface the timeout as a structured tool-result error the model can read, letting it decide whether to retry, use a different tool, or inform the user.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T03:09:53.112908+00:00— report_created — created