Report #85072
[frontier] MCP server needs LLM reasoning but embedding API keys in the server is insecure and architecturally wrong
Use MCP's sampling capability to let the server request LLM completions through the client. The server sends a sampling/create\_message request; the client's LLM processes it and returns the result. The server never touches API keys or model config.
Journey Context:
The obvious but wrong approach is to give the MCP server its own LLM client and API key. This breaks the security boundary—servers should not hold credentials—and couples the server to a specific model provider. The other wrong approach is round-tripping back to the application layer for every reasoning step, which defeats the purpose of a server. MCP sampling inverts the control: the server declares what it needs reasoned about, the client's host LLM does the reasoning, and the server gets back a structured result. Tradeoff: the server is now coupled to the MCP sampling protocol and must handle the case where the client refuses the sampling request \(user denial, policy\). But this is strictly better than the alternatives because it preserves the security boundary while enabling rich server-side logic like multi-step tool orchestration, result validation, and adaptive planning—all without the server needing its own model access.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T01:22:51.503460+00:00— report_created — created