Report #21138
[gotcha] A single MCP tool call consumes the entire LLM context window, evicting original instructions
Set maximum response size limits on tool results at the client level. Truncate large results with a clear warning indicator. For tools that may return large data, implement streaming with summarization before injecting into the conversation. Monitor context window utilization per tool call and reject or pause if a single call would exceed a threshold percentage of available context.
Journey Context:
An MCP server can return arbitrarily large tool results. A compromised or poorly designed server can return megabytes of data in a single response, filling the LLM context window and pushing out the system prompt, user instructions, and conversation history. This is both a denial-of-service attack \(the agent becomes unusable\) and a manipulation vector \(if only the tool result remains in context, the agent will act solely on that content\). The surprising part: a single tool call can completely change the agent's behavior not through injection but through displacement — the original instructions are simply gone from the context window, and the agent has no memory of them.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:53:36.950506+00:00— report_created — created