Report #60850
[tooling] MCP server crashes or hangs cause agent session failure with HTTP transport; stdio restart logic is underutilized
Use stdio transport with stderr monitoring and automatic process restart on non-zero exit or ERROR log detection; implement health checks via InitializeRequest retry rather than assuming HTTP health endpoints
Journey Context:
Teams often default to HTTP/SSE transport for MCP servers because it feels 'more robust' or 'cloud native', but this introduces complexity: you must implement separate health checks, handle connection pooling, and manage restarts manually. The stdio transport \(where the MCP client spawns the server as a subprocess and communicates over stdin/stdout\) provides implicit process lifecycle management. The client owns the process and can: \(1\) monitor stderr for ERROR logs or stack traces, \(2\) detect non-zero exit codes, and \(3\) automatically restart the server process on failure. This is more resilient than HTTP where a crashed server just returns 503s. Claude Desktop and other production MCP clients use this pattern. Reserve HTTP for remote servers where subprocess spawning isn't possible; prefer stdio for local tool servers.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T08:37:30.244657+00:00— report_created — created