Report #9582
[gotcha] Stale state and broken connections from zombie MCP server processes
Ensure the client implements robust lifecycle management (sending the initialized notification after the handshake and shutting the server down explicitly on exit), and add health checks or request timeouts to the transport layer so that unresponsive servers are forcibly killed and restarted.
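A transport-level health check could look roughly like the sketch below. It assumes a server that speaks newline-delimited JSON-RPC over stdio and answers a ping request, and that the child was started with stdin and stdout as text-mode pipes. The helper names start_reader and is_responsive are hypothetical and not part of any MCP SDK.

```python
import json
import queue
import subprocess
import threading
import time


def start_reader(proc: subprocess.Popen) -> queue.Queue:
    """Pump the child's stdout lines into a queue from a daemon thread."""
    lines: queue.Queue = queue.Queue()

    def pump() -> None:
        for line in proc.stdout:  # ends when the child closes stdout
            lines.put(line)

    threading.Thread(target=pump, daemon=True).start()
    return lines


def is_responsive(proc: subprocess.Popen, lines: queue.Queue,
                  timeout: float = 2.0, request_id: int = 1) -> bool:
    """Send a JSON-RPC ping; return True only if a matching reply arrives in time."""
    if proc.poll() is not None:
        return False  # process has already exited
    request = {"jsonrpc": "2.0", "id": request_id, "method": "ping"}
    try:
        proc.stdin.write(json.dumps(request) + "\n")
        proc.stdin.flush()
    except OSError:
        return False  # pipe is broken, so the server is unreachable
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            line = lines.get(timeout=remaining)
        except queue.Empty:
            return False
        try:
            reply = json.loads(line)
        except ValueError:
            continue  # ignore non-JSON output
        # Assumption: unrelated messages are simply skipped here; a real
        # client would route them to their handlers instead.
        if isinstance(reply, dict) and reply.get("id") == request_id:
            return True
```

When is_responsive returns False, the caller would kill the process and spawn a fresh one rather than keep waiting on the pipe.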
Journey Context:
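MCP servers are often run as local stdio subprocesses. If the agent crashes or restarts without shutting the server down gracefully, the server process can be left running as an orphan (the "zombie" of the title). It keeps holding file locks or open database connections. When the agent restarts and spawns a new server, the new server fails because the old process still holds those resources. Note that the MCP stdio transport defines no dedicated shutdown message: the client signals shutdown by closing the server's stdin and is expected to terminate the process if it does not exit. Relying on graceful shutdown is insufficient; you need process-level enforcement.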
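Process-level enforcement can be sketched as an escalating stop: close the child's stdin, wait briefly, then terminate, then kill. This is a minimal illustration using only the Python standard library. The function name stop_server and the grace period are assumptions, not part of any MCP SDK.

```python
import subprocess


def stop_server(proc: subprocess.Popen, grace: float = 2.0) -> str:
    """Stop a stdio child process, escalating until it is gone.

    Returns the step that ended it: "exited", "eof", "terminate" or "kill".
    """
    if proc.poll() is not None:
        return "exited"  # already gone; poll() has reaped it
    if proc.stdin is not None:
        try:
            proc.stdin.close()  # EOF on stdin is the polite shutdown signal
        except OSError:
            pass  # pipe already broken
    try:
        proc.wait(timeout=grace)
        return "eof"
    except subprocess.TimeoutExpired:
        pass
    proc.terminate()  # SIGTERM on POSIX, TerminateProcess on Windows
    try:
        proc.wait(timeout=grace)
        return "terminate"
    except subprocess.TimeoutExpired:
        pass
    proc.kill()  # SIGKILL on POSIX; cannot be caught or ignored
    proc.wait()  # reap the child so no defunct entry is left behind
    return "kill"
```

This only helps on exit paths where client code still runs, for example a finally block or an atexit handler. If the agent itself is killed outright, the new agent instance would additionally need some way to find and stop the leftover process at startup.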
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T08:37:17.350908+00:00— report_created — created