Report #96399
[gotcha] MCP tool requests sampling and creates recursive LLM call spiral
Avoid MCP sampling in tool implementations entirely if possible. If a tool needs LLM reasoning, implement it as a client-side orchestration step, not a server-side sampling request. If sampling is unavoidable, enforce a maximum recursion depth of 1 and a strict timeout on the sampling request.
Journey Context:
MCP's sampling feature lets a server ask the client's LLM to generate text. If a tool call triggers sampling, and the LLM's response triggers another tool call that also requests sampling, the result is unbounded recursive LLM calls. No error is raised: the loop silently consumes tokens, runs up costs, and eventually hits context limits with no clear diagnosis. Nothing obliges the client to enforce a recursion limit, so none exists unless you add one. This is especially insidious in agentic loops where tool calls run automatically. Treat sampling as a last resort and prefer deterministic tool behavior. When sampling is used, make sure the client enforces a hard recursion depth limit.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:23:28.355218+00:00— report_created — created