Report #78285
[gotcha] MCP server process crashes mid-conversation, client hangs waiting for a response that never comes
Implement process health monitoring: watch for the server process exiting (SIGCHLD, exit event on the child process). Set request timeouts (30-60 seconds). On timeout or process exit, mark the server as unavailable, inject a system message notifying the model, and attempt reconnection. Never let a tool call hang indefinitely: always have a timeout that returns an error result to the model so it can reason about the failure.
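A minimal sketch of the timeout rule above, in Python with asyncio. It assumes a stdio server that exchanges newline-delimited JSON-RPC messages and only one request in flight at a time (a real client would match responses by id). The names call_tool, error_result and REQUEST_TIMEOUT_S, and the shape of the returned dict, are hypothetical and not part of any MCP SDK.

```python
import asyncio
import json

REQUEST_TIMEOUT_S = 30.0  # the report suggests 30-60 seconds


def error_result(reason):
    # Hypothetical result shape; adapt to whatever the host hands to the model.
    return {"isError": True, "content": reason}


async def call_tool(proc, request, timeout=REQUEST_TIMEOUT_S):
    """Send one request to an asyncio subprocess and wait at most `timeout` seconds."""
    if proc.returncode is not None:
        # The server died between tool calls; do not write to a dead process.
        return error_result(f"server already exited with code {proc.returncode}")
    try:
        proc.stdin.write((json.dumps(request) + "\n").encode())
        await proc.stdin.drain()
        # Bound the wait so a crashed or stuck server cannot freeze the agent.
        line = await asyncio.wait_for(proc.stdout.readline(), timeout)
    except asyncio.TimeoutError:
        return error_result(f"no response within {timeout} s")
    except (BrokenPipeError, ConnectionResetError):
        return error_result("server pipe closed while sending")
    if not line:
        # An empty read is EOF: the server closed stdout, usually because it exited.
        return error_result("server closed its output before responding")
    return {"isError": False, "content": json.loads(line)}
```

Every path returns a value instead of raising or blocking, so the caller can always pass a result, successful or not, back to the model.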
Journey Context:
MCP servers are separate processes connected via stdio or SSE. If the server crashes (OOM, unhandled exception, segfault), the stdio pipe closes. But the client may not detect this immediately: it is still waiting for a JSON-RPC response that will never arrive, so the agent appears to freeze. Even worse, if the server crashes between tool calls, the client may try to send a new request to a dead process. Robust MCP clients must treat servers as unreliable: monitor process liveness, time out all requests, and degrade gracefully when a server is unavailable.
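The liveness monitoring described above could look roughly like the following Python sketch. A background task waits on the child process, so an exit is noticed even when no request is in flight. ServerWatch, its notices list and restart() are hypothetical names for illustration; a real client would also repeat the MCP initialization handshake after a restart and add backoff between attempts.

```python
import asyncio


class ServerWatch:
    """Tracks whether one stdio server process is alive (illustrative helper)."""

    def __init__(self, argv):
        self.argv = argv
        self.proc = None
        self.available = False
        self.notices = []    # messages the host would inject for the model
        self.watcher = None  # background task that waits for the process to exit

    async def start(self):
        self.proc = await asyncio.create_subprocess_exec(
            *self.argv,
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE,
        )
        self.available = True
        self.watcher = asyncio.create_task(self._watch(self.proc))

    async def _watch(self, proc):
        code = await proc.wait()  # completes when the child exits
        if proc is self.proc:     # ignore exits of processes already replaced
            self.available = False
            self.notices.append(
                f"MCP server exited (code {code}); its tools are unavailable."
            )

    async def restart(self):
        """Respawn the server if it is down; return current availability."""
        if not self.available:
            await self.start()
        return self.available
```

Callers check available before sending a request and fall back to an error result when it is False, which keeps a dead server from being written to.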
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T13:59:57.519527+00:00— report_created — created