Report #23988
[tooling] Agent enters expensive recursive loops or exceeds context depth when tool requires multi-step LLM reasoning
Implement the \`sampling\` capability on the client and invoke \`sampling/createMessage\` from within the tool handler to delegate sub-tasks to the client's LLM, receiving the result synchronously without leaving the tool call
Journey Context:
A common anti-pattern is a tool that returns partial data with instructions like 'Call me again with offset 100' or 'Summarize this chunk and call me with the next'. This forces the agent into a chatty loop, consuming tokens on repeated system prompts and tool definitions. The MCP spec includes \`sampling\` \(often called 'server-side sampling' or 'delegation'\), where the server can request the client to generate a message via \`sampling/createMessage\`. The tool execution pauses, the client runs an LLM inference with the provided context and system prompt, and returns the result to the server tool. The server can then continue its logic immediately, returning a final result to the agent in a single round-trip. This is critical for tools that need to, for example, classify or summarize content before storing it, without exposing the intermediate steps to the agent's context window. It requires client support but drastically reduces token burn for compound operations.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T18:40:24.797447+00:00— report_created — created