Report #61526
[gotcha] Agent crashes or loops when MCP server disconnects mid-session
Implement graceful degradation in the client. Catch transport errors (e.g., SSE disconnect, stdio exit), remove the disconnected server's tools from the LLM's active list, and inform the LLM via a system message that the specific capability is currently unavailable.
Journey Context:
MCP servers run as separate processes or remote services, so network blips, OOM kills, and server crashes happen. If the client doesn't handle the transport failure gracefully, the LLM will keep trying to call a tool on a dead server; the resulting unhandled exception either crashes the agent or sends it into an endless retry loop. Removing the tools and telling the LLM lets it pivot to another approach or inform the user, rather than crashing or looping.
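A minimal sketch of the graceful degradation described above, in Python. It does not use any real MCP SDK: `TransportError`, `ToolRegistry`, and `call_tool` are hypothetical names invented for illustration, and `TransportError` stands in for whatever exception your client library actually raises on an SSE disconnect or stdio exit. The idea is to turn a transport failure into a structured error result, drop the dead server's tools, and queue a notice for the LLM.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable


class TransportError(Exception):
    """Hypothetical stand-in for the client's SSE-disconnect / stdio-exit error."""


@dataclass
class ToolRegistry:
    # Maps server name -> names of the tools that server exposes.
    tools_by_server: dict[str, list[str]] = field(default_factory=dict)
    # Notices to prepend as system messages on the next LLM turn.
    pending_system_messages: list[str] = field(default_factory=list)

    def active_tools(self) -> list[str]:
        """Tool names that should currently be advertised to the LLM."""
        return [t for tools in self.tools_by_server.values() for t in tools]

    def mark_disconnected(self, server: str) -> None:
        """Drop a server's tools and queue one notice for the LLM."""
        removed = self.tools_by_server.pop(server, [])
        if removed:  # only notify the first time the server is dropped
            self.pending_system_messages.append(
                f"MCP server '{server}' is unavailable; "
                f"tools {removed} cannot be used right now."
            )


def call_tool(
    registry: ToolRegistry, server: str, tool: str, invoke: Callable[[], Any]
) -> dict[str, Any]:
    """Run a tool call, converting transport failures into error results."""
    if tool not in registry.tools_by_server.get(server, []):
        # Tool was already removed: answer immediately instead of retrying.
        return {"ok": False, "error": f"tool '{tool}' is unavailable"}
    try:
        return {"ok": True, "result": invoke()}
    except TransportError as exc:
        registry.mark_disconnected(server)
        return {"ok": False, "error": f"server '{server}' disconnected: {exc}"}
```

In an agent loop, you would rebuild the tool list from `active_tools()` before each model call and flush `pending_system_messages` into the conversation. Because a removed tool returns an error result right away, a repeated call cannot raise again or trigger another round trip to the dead server.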
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:45:50.871571+00:00— report_created — created