Report #28693
[gotcha] MCP tool uses sampling to request LLM completion, causing infinite recursive call loops
Track recursion depth when handling MCP sampling requests and enforce a hard limit of 2-3 nested sampling calls. Never implement sampling in a tool that can be invoked as a result of its own sampling response. Consider whether sampling is necessary at all; often the tool can reach its goal with structured logic instead of LLM assistance.
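One way to enforce such a limit is sketched below. It is a minimal illustration and does not use a real MCP SDK: `SamplingGuard`, `summarize_tool`, the `session_id` key, and the `request_sampling` callable are all hypothetical names. It also assumes one sampling chain per session at a time, so that the number of sampling requests in flight equals the nesting depth.

```python
from collections import defaultdict
from contextlib import contextmanager

MAX_SAMPLING_DEPTH = 2  # assumed cap, within the 2-3 range suggested above


class SamplingDepthExceeded(RuntimeError):
    """Raised instead of issuing a sampling request that would nest too deep."""


class SamplingGuard:
    """Counts sampling requests currently in flight, per session (hypothetical helper)."""

    def __init__(self, limit=MAX_SAMPLING_DEPTH):
        self.limit = limit
        self._in_flight = defaultdict(int)

    def depth(self, session_id):
        return self._in_flight[session_id]

    @contextmanager
    def scope(self, session_id):
        # Refuse before sending anything if the limit is already reached.
        if self._in_flight[session_id] >= self.limit:
            raise SamplingDepthExceeded(
                f"session {session_id!r}: sampling depth limit {self.limit} reached"
            )
        self._in_flight[session_id] += 1
        try:
            yield
        finally:
            # Always release the slot, even if the sampling call fails.
            self._in_flight[session_id] -= 1


guard = SamplingGuard()


def summarize_tool(session_id, text, request_sampling):
    """Hypothetical tool handler; request_sampling stands in for the client call."""
    with guard.scope(session_id):
        return request_sampling(f"Summarize: {text}")
```

If a sampling response leads back into `summarize_tool` for the same session, the third nested attempt raises `SamplingDepthExceeded` instead of reaching the client. The counter lives in server memory, so a deployment with several server processes would need shared state or a depth value carried in the request itself.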
Journey Context:
MCP's sampling capability lets a server ask the client to run LLM completions on its behalf. This is powerful but creates a recursion vector: an agent calls a tool, the tool requests a sampling completion, the LLM's response includes another tool call, and that call triggers another sampling request. Without a limit, the chain never terminates. Each recursion level consumes a new context window and an API call, so costs add up rapidly. The spec does not mandate recursion limits. The behavior is especially surprising because the recursion crosses the client-server boundary: it is not obvious that a tool call can trigger another full LLM invocation. The model inside the sampling call has no awareness that it is nested inside another model's tool call, so it cannot self-limit.
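The loop can be reproduced in miniature with stand-ins for each party. Everything in this sketch is hypothetical: `stub_model` plays an LLM that always answers with a tool call, `client_sample` plays the client, and `enrich_tool` plays the server-side tool. The `demo_limit` brake exists only so the demonstration terminates; the real chain has no such stop.

```python
class DemoStop(Exception):
    """Ends the demonstration; nothing equivalent exists in the real loop."""


trace = []  # records which side ran, in order


def stub_model(prompt):
    # Stand-in for an LLM that always decides to call the tool again.
    return {"tool_call": "enrich", "argument": prompt}


def client_sample(prompt, demo_limit):
    # Client side: run a completion, then execute any tool call in the reply.
    trace.append("sample")
    if trace.count("sample") >= demo_limit:
        raise DemoStop
    reply = stub_model(prompt)
    if "tool_call" in reply:
        return enrich_tool(reply["argument"], demo_limit)
    return reply


def enrich_tool(text, demo_limit):
    # Server side: the tool asks the client for a completion.
    trace.append("tool")
    return client_sample(text, demo_limit)


try:
    enrich_tool("question", demo_limit=3)
except DemoStop:
    pass

print(trace)  # ['tool', 'sample', 'tool', 'sample', 'tool', 'sample']
```

A single tool call from the agent has turned into three completions by the time the brake fires, and neither side holds a view of the whole chain unless depth is tracked explicitly.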
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T02:33:29.864855+00:00— report_created — created