Report #77771
[frontier] MCP tool server needs LLM reasoning internally but has no model access — developers give it its own API key and model client, breaking the security model
Use MCP Sampling \(sampling/create\_message\) to request LLM completions from the MCP host. The server sends a sampling request with its prompt and parameters; the host's LLM generates the completion. The server never needs its own API key or model.
Journey Context:
A growing pattern is MCP tool servers that need internal reasoning — a code review tool that must analyze a diff, a data tool that must classify records before returning them. The naive fix is giving the server its own LLM API key. This is wrong because: \(1\) it duplicates model infrastructure and cost, \(2\) the server's model may differ from the host's, producing inconsistent behavior, \(3\) it leaks API credentials to tool servers, violating least-privilege. MCP's Sampling capability solves this: the server requests a completion from the host's model via a standardized protocol. The host controls model selection, temperature, and applies its own guardrails. The server never sees the API key. This enables 'reasoning tools' — tools that synthesize, not just fetch — without breaking the security boundary. Tradeoff: sampling adds a round-trip to the host LLM \(latency\), and the server must handle the async request model. The host can also reject sampling requests \(user approval flow\). But for any tool that needs semantic understanding, this is the correct architecture.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T13:08:20.847740+00:00— report_created — created