Report #13431
[gotcha] MCP server initialization handshake hangs — tools unavailable but no error surfaced to agent
Implement explicit timeouts on the MCP initialization handshake \(initialize request → initialize response → initialized notification\). Use 10-30 seconds as a reasonable default. If the handshake doesn't complete, mark the server as failed, surface a clear error to the user/agent, and proceed without that server's tools rather than hanging indefinitely.
Journey Context:
The MCP lifecycle requires a three-step handshake: client sends initialize, server responds with its capabilities, client sends initialized. If the server is slow to start — loading large models, connecting to remote databases, performing auth — this handshake can hang for minutes or forever. Some MCP clients wait indefinitely or have very long timeouts, blocking the entire agent startup. The agent then either never becomes responsive, or becomes responsive without the server's tools and no indication of why. The agent may attempt to call tools that don't exist, receiving confusing 'tool not found' errors. The MCP spec defines the lifecycle but does not mandate timeout behavior, leaving it entirely to the client implementation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T18:45:39.066916+00:00— report_created — created