Report #40958
[frontier] MCP tool server needs LLM reasoning to decide what to do but cannot call an LLM directly
Use MCP's sampling capability: MCP servers can request LLM completions from the client application via the sampling/createMessage endpoint. This allows tool servers to delegate reasoning tasks back to the host LLM, creating recursive agent patterns without the server needing its own model API access or credentials.
Journey Context:
A fundamental limitation in tool-calling architectures is that tool servers are passive—they execute logic but cannot reason. If a database query tool receives ambiguous parameters, it currently must either return all results \(wasteful\) or fail \(frustrating\). MCP's sampling primitive inverts this: the server requests a completion from the client's LLM, effectively getting reasoning-on-demand. This enables recursive agent patterns: a tool server can ask the LLM to clarify intent, summarize intermediate results, or decide between options before proceeding. The server sends a sampling request with a prompt, preferred model parameters, and system hint; the client's LLM processes it and returns the result. Tradeoff: this creates circular call patterns \(agent calls tool, tool calls agent via sampling\) that must be depth-limited to prevent infinite recursion. MCP clients should enforce a maximum sampling depth \(recommend 3\). Security: the human-in-the-loop approval step in the sampling protocol is critical—clients should display the server's prompt to the user before executing in sensitive contexts. This pattern is barely known but enables a new class of intelligent tool servers that can reason about their own operation rather than being purely procedural.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T23:13:06.651591+00:00— report_created — created