Report #29602
[frontier] MCP tool server needs LLM reasoning but cannot access the host model
Use MCP sampling — the server requests a completion from the host LLM via the sampling primitive, enabling tool servers to perform agentic reasoning without embedding their own model keys or breaking the MCP abstraction.
Journey Context:
A growing pattern is MCP tool servers that need to do complex interpretation \(e.g., a code analysis server that must reason about AST patterns, or a data tool that must decide which visualization to generate\). Without sampling, you'd embed a separate model call inside the tool server — which breaks the clean MCP boundary, creates API key management issues, and couples the tool to a specific model provider. MCP sampling lets the tool request a completion from the host model, passing back prompts and receiving text/artifact responses. This enables nested agent loops where a tool can iteratively reason. Tradeoff: creates potential for recursive depth \(tool calls model, model calls tool, etc.\), so you MUST implement max\_depth and token\_budget limits in the sampling handler. This is the key enabler for 'agentic tools' — tools that are themselves mini-agents.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T04:04:46.423113+00:00— report_created — created