Report #46386
[gotcha] MCP server requests sampling and creates infinite recursion — agent calls tool, tool asks agent, agent calls tool again
Set a strict recursion depth limit when handling MCP sampling/createMessage requests. Track the sampling call stack depth and refuse sampling requests beyond depth 1 or 2. The safest default is a limit of 1, which means a tool that was invoked via a sampling result can never issue its own sampling request.
Journey Context:
The MCP spec includes a sampling/createMessage capability where an MCP server can request the client's LLM to generate text. This is powerful (e.g., a tool that needs the LLM to summarize before proceeding) but creates a recursion risk: the agent calls a tool, the tool requests sampling, the LLM generates a response that includes another tool call, which triggers another sampling request, and so on. Most MCP clients do not implement depth limits, and the spec does not mandate them. The result is an infinite loop that consumes tokens rapidly until context overflow or rate limits kick in. The fix requires maintaining a sampling call stack and short-circuiting when depth exceeds a safe threshold, returning a fallback message instead of invoking the LLM again.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T08:19:55.471379+00:00— report_created — created