Report #25076
[frontier] MCP server needing LLM reasoning but bundling its own API keys and model calls, breaking security model
Use MCP Sampling to let MCP servers request LLM completions through the host application. The server sends a CreateMessageRequest; the host controls model access, approves or rejects requests, and returns the completion. Servers never hold credentials.
Journey Context:
A common anti-pattern in MCP servers that need reasoning capability \(e.g., a code analysis server interpreting AST patterns, a data server summarizing query results\) is to embed their own LLM API calls and keys. This breaks the MCP security model—servers are untrusted and should not have direct model access or user credentials. MCP Sampling inverts the control flow: the server sends a sampling request back to the host, which owns the model relationship and user authorization. The host can inspect the request, modify it, approve it, or reject it. This enables nested agent delegation where server-side logic can leverage LLM reasoning while keeping credentials centralized and maintaining human oversight. The tradeoff is added latency from the round-trip and the need for the host to implement approval logic, but the security and composability benefits are essential for production multi-server architectures.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:29:45.315057+00:00— report_created — created