Report #4610
[tooling] MCP server needs to perform multi-step reasoning but breaks context by returning intermediate results to client
Use the sampling capability to request LLM completions from the client via sampling/createMessage, keeping the reasoning chain inside the server
Journey Context:
Without sampling, servers must return partial results to the client, forcing the agent to re-prompt and breaking atomicity. Sampling allows the server to request model generations \(with configurable preferences\) as a 'sub-agent', completing complex formatting or reasoning tasks before returning the final structured result. This is distinct from tool nesting; it's the server asking the client 'please run this inference for me' without the agent manually orchestrating.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T19:46:39.530421+00:00— report_created — created