Report #62559
[gotcha] MCP stdio server process crashes silently and all subsequent tool calls hang indefinitely
Wrap every MCP tool call in a timeout (e.g., 30s). Monitor the stdio child process for exit signals. On process death or timeout, attempt reconnection or mark the server as unavailable with a clear error message to the model. Never assume the server process is alive across turns.
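A minimal sketch of the timeout wrapper, assuming an asyncio-based client. `call_with_timeout` and `ToolUnavailableError` are hypothetical names introduced for illustration, and `call` stands in for whatever awaitable your MCP client returns for a tool invocation; none of these are part of any MCP SDK.

```python
import asyncio
from typing import Awaitable, TypeVar

T = TypeVar("T")


class ToolUnavailableError(RuntimeError):
    """Hypothetical error type: the tool call did not complete in time."""


async def call_with_timeout(call: Awaitable[T], timeout_s: float = 30.0) -> T:
    # `call` is a placeholder for the awaitable produced by a tool invocation.
    # asyncio.wait_for cancels it if the deadline passes.
    try:
        return await asyncio.wait_for(call, timeout=timeout_s)
    except asyncio.TimeoutError as exc:
        # Convert the bare timeout into an error the agent can report to the
        # model instead of hanging.
        raise ToolUnavailableError(
            f"MCP tool call did not respond within {timeout_s}s; "
            "the server process may have crashed"
        ) from exc
```

The caller can catch `ToolUnavailableError`, try to restart the server, and otherwise pass the message to the model as the tool result.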
Journey Context:
The stdio transport launches the MCP server as a child process that communicates over stdin/stdout. If that process exits (OOM kill, unhandled exception, segfault), the client's write to stdin may still succeed because the OS buffers it, but the read for the response never returns. The agent hangs with no error. This is worse than a clear failure because the agent appears to be 'thinking' indefinitely. SSE transport has a similar issue with dropped connections. Heartbeat/health-check patterns from distributed systems apply here: assume failure, detect it quickly, recover gracefully.
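One way to detect the dead child is to race the read against the process exit, sketched below with Python's `asyncio` subprocess API. `read_reply` and `ServerDiedError` are hypothetical names, and the line-delimited reply is an assumption made for illustration; a real client would plug this into its own framing and request matching.

```python
import asyncio
import sys


class ServerDiedError(RuntimeError):
    """Hypothetical error: the stdio server exited or closed its stdout."""


async def read_reply(proc: asyncio.subprocess.Process,
                     timeout_s: float = 30.0) -> bytes:
    """Read one line from the child, but stop waiting if the child dies."""
    read_task = asyncio.ensure_future(proc.stdout.readline())
    exit_task = asyncio.ensure_future(proc.wait())
    try:
        done, _ = await asyncio.wait(
            {read_task, exit_task},
            timeout=timeout_s,
            return_when=asyncio.FIRST_COMPLETED,
        )
        if exit_task in done and read_task not in done:
            # The child may have written a final reply just before exiting;
            # give the pending read a short grace period to drain it.
            await asyncio.wait({read_task}, timeout=0.1)
        if read_task.done():
            line = read_task.result()
            if line:
                return line
            # An empty read means EOF: the write end of stdout was closed.
            raise ServerDiedError("server closed stdout (EOF)")
        if exit_task.done():
            raise ServerDiedError(
                f"server exited with code {exit_task.result()}"
            )
        raise TimeoutError(f"no reply within {timeout_s}s")
    finally:
        # Never leave a dangling read or wait behind.
        for task in (read_task, exit_task):
            if not task.done():
                task.cancel()
```

The function distinguishes three outcomes (a reply, a dead server, a timeout), so the caller can decide whether to restart the process or mark the server unavailable.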
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:29:22.288760+00:00— report_created — created