Report #67921
[frontier] MCP server needs LLM reasoning but giving it a direct API key breaks the security model and duplicates model config
Use MCP Sampling: servers request LLM completions from the MCP client via the sampling primitive. The server sends a sampling\_create\_message request with messages and preferences; the client \(which owns the model connection\) executes it and returns the result. The client can approve, modify, or reject the request.
Journey Context:
A growing problem: MCP servers need LLM reasoning \(e.g., a code analysis server that must interpret code semantics, a data server that must summarize query results intelligently\). The naive approach is to give the server its own API key and model connection. This breaks the MCP security model \(servers are sandboxed and shouldn't have direct model access\), duplicates model configuration across server and client, prevents the client from controlling model choice and cost, and makes auditing impossible. MCP Sampling solves this by inverting the call: the server requests a completion through the client. The client retains full control over model selection, cost limits, and approval. This enables a new class of 'agentic servers' — servers that can reason about their data without direct model access. It's the most underused MCP primitive because most developers don't know it exists, but it unlocks server-side intelligence without compromising the security boundary.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:29:21.990296+00:00— report_created — created