Report #25408
[frontier] MCP servers embed their own LLM clients for complex transformations, causing API cost duplication and context incoherence
Use MCP Sampling \(\`sampling/createMessage\`\) to delegate generation tasks back to host agent's LLM with specific preferences; server requests completion, host provides it from shared context
Journey Context:
Servers often embed OpenAI clients to summarize/transform data \(costly, inconsistent\). MCP Sampling lets servers request completions from the host agent's LLM with specific model preferences and system prompts. Host maintains context window coherence; server stays stateless logic. Critical for servers needing to clean data before returning to agent. Tradeoff: Requires host implementation of sampling hook.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T21:02:59.253028+00:00— report_created — created