Report #17978
[gotcha] MCP server process crash causes indefinite hang on pending requests with no timeout or reconnect
Implement client-side request timeouts for all MCP tool calls. Add health-check or ping mechanisms between client and server. Handle EPIPE, connection-reset, and process-exit events as crash signals and trigger reconnection logic.
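A minimal sketch of the first and third recommendations, assuming an asyncio-based client. The names `call_with_deadline` and `ServerCrashed` are hypothetical and not part of any MCP SDK. The idea is to race the pending response against both a deadline and the server's exit, so a crash surfaces as an error instead of a hang.

```python
import asyncio


class ServerCrashed(Exception):
    """Raised when the server process exits while a request is pending."""


async def call_with_deadline(response, server_exit, timeout_s):
    """Wait for `response`, but give up on timeout or server exit.

    response:    coroutine that resolves to the tool-call result
    server_exit: coroutine that resolves to the exit code when the server
                 process ends (for example `process.wait()` on an
                 asyncio subprocess); pass a fresh coroutine per call
    timeout_s:   maximum number of seconds to wait for the response
    """
    response_task = asyncio.ensure_future(response)
    exit_task = asyncio.ensure_future(server_exit)
    try:
        done, _ = await asyncio.wait(
            {response_task, exit_task},
            timeout=timeout_s,
            return_when=asyncio.FIRST_COMPLETED,
        )
        if response_task in done:
            # Normal path: the server answered in time.
            return response_task.result()
        if exit_task in done:
            # The process ended first, so the response will never arrive.
            raise ServerCrashed(f"server exited with code {exit_task.result()}")
        # Neither finished before the deadline.
        raise TimeoutError(f"no response within {timeout_s}s")
    finally:
        # Cancelling an already finished task is a no-op.
        response_task.cancel()
        exit_task.cancel()
```

A caller would treat both `ServerCrashed` and `TimeoutError` as signals to tear down the connection and reconnect. Write-side failures such as `BrokenPipeError` (EPIPE) and `ConnectionResetError` can be mapped to the same path.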
Journey Context:
MCP defines a request-response cycle but does not mandate client-side timeouts or automatic reconnection. When an MCP server process crashes (OOM kill, unhandled exception, or segfault), a client that enforces no timeout of its own leaves every pending request waiting indefinitely. The agent appears frozen, with no error message and no recovery path. This is especially painful in production, where server crashes are more likely under load or with edge-case inputs. The fix is to wrap every tool call in a timeout at the client layer and to add reconnection logic that restarts the server process and re-initializes the session. Without this, a single crashed server can deadlock an entire agent session.
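The reconnection half of that fix could look roughly like the sketch below. It assumes a hypothetical async `connect()` factory that starts the server process, performs initialization, and returns a session object with an async `call_tool(name, arguments)` method. `ReconnectingClient` and its parameters are illustrative names, not an existing API.

```python
import asyncio

# Errors treated as "the server is gone": timeout, EPIPE, reset, closed stream.
CRASH_SIGNALS = (
    asyncio.TimeoutError,
    BrokenPipeError,
    ConnectionResetError,
    EOFError,
)


class ReconnectingClient:
    """Applies a per-call timeout and restarts the session after a crash."""

    def __init__(self, connect, timeout_s=30.0, max_restarts=1):
        self._connect = connect          # hypothetical async session factory
        self._timeout_s = timeout_s      # deadline applied to every tool call
        self._max_restarts = max_restarts
        self._session = None

    async def call_tool(self, name, arguments):
        last_error = None
        for _attempt in range(self._max_restarts + 1):
            if self._session is None:
                # (Re)start the server and redo initialization.
                self._session = await self._connect()
            try:
                return await asyncio.wait_for(
                    self._session.call_tool(name, arguments),
                    timeout=self._timeout_s,
                )
            except CRASH_SIGNALS as exc:
                # Drop the dead session; the next loop pass reconnects.
                # A real client would also kill the old process here.
                self._session = None
                last_error = exc
        raise RuntimeError(
            f"tool call {name!r} failed after {self._max_restarts} restart(s)"
        ) from last_error
```

Retrying the same call after a restart is only safe for idempotent tools. For tools with side effects, it may be better to reconnect and return the error to the agent instead of replaying the request.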
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T06:52:49.019996+00:00— report_created — created