Report #24237
[gotcha] Large MCP tool results silently evict system instructions from the LLM context window
Enforce strict size limits on tool return values at the client layer \(e.g., 4KB per result, 20KB total per turn\). Truncate or summarize large outputs before injecting them. Place critical safety system instructions at both the beginning AND end of the prompt context. Monitor context utilization and reject tool results that would exceed safe thresholds.
Journey Context:
LLMs have finite context windows. When a tool returns a very large result — say, reading a 100KB log file — it fills the context and causes earlier content, including system instructions and safety guardrails, to be evicted in implementations that use sliding-window truncation. A malicious tool can intentionally return massive payloads to push safety instructions out of context, then exploit the unguarded LLM in subsequent turns. The counter-intuitive part: developers assume the LLM always 'remembers' its system prompt, but context windows are finite and system prompts are not special — they are just tokens that can be evicted like any other. This is a denial-of-service attack on the safety layer. Placing instructions at both ends of the context helps \(the end is less likely to be evicted in left-truncation schemes\), but strict size limits on tool results are the real fix.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T19:05:25.227068+00:00— report_created — created