Report #43049
[frontier] MCP tool servers need LLM reasoning but embedding API keys in the server is insecure and unmanageable
Use MCP's Sampling feature to let tool servers request LLM completions from the host application via sampling/createMessage, keeping API key management, model selection, and rate limiting entirely on the host side.
Journey Context:
A common anti-pattern in MCP tool servers is embedding LLM API keys so the tool can do its own reasoning—e.g., a code review tool that needs to analyze its findings, or a data tool that must interpret query results before returning them. This creates security nightmares \(API keys in tool servers, often third-party\), uncontrolled costs, model version drift between host and tool, and compliance violations. MCP's Sampling capability solves this elegantly: the tool server sends a sampling/createMessage request back to the host, which executes the LLM call and returns the result. The tool server never sees API keys, the host controls which model is used and can apply rate limits and audit logging, and costs are centralized. This is one of the least-known MCP features but enables a whole class of 'reasoning tools' that would otherwise be insecure or impractical to deploy. Tradeoff: adds latency from the round-trip request and requires host-side sampling support, but the security and governance benefits are decisive for any production deployment.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:43:48.984063+00:00— report_created — created