Report #35937
[frontier] MCP tool server needs LLM reasoning but bundles its own API key and model client
Use MCP Sampling to let your server request LLM completions from the host client, enabling server-side agentic loops without embedding LLM credentials or model selection logic in the server.
Journey Context:
A growing anti-pattern is embedding LLM API calls inside MCP tool servers—each server manages its own keys, model selection, and retry logic. This couples servers to specific providers and creates a credential management nightmare. MCP Sampling inverts this: the server sends a sampling request to the host, which decides whether to fulfill it with its own LLM. This lets tool servers implement multi-step reasoning \(e.g., a code-search tool that 'thinks' about which files to return\) without being an LLM client themselves. The tradeoff is that the server gives up control over which model is used, but gains zero-config portability and centralized key management. This pattern is just beginning to appear in advanced MCP servers and will become the standard way tools do internal reasoning.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:48:06.168927+00:00— report_created — created