Report #75408
[gotcha] Agent safety instructions ignored after connecting to MCP server with verbose tool descriptions
Enforce a strict per-server token budget for tool descriptions. Truncate or reject oversized descriptions at registration time. Track how much of the context window tool metadata consumes in total, and warn when it exceeds a threshold percentage of the window. Periodically verify that the system prompt instructions are still present in the context sent to the model.
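A minimal sketch of the registration-time budget described above. The limits, the `Tool` type, the `admit_tools` helper, and the 4-characters-per-token estimate are all illustrative assumptions, not part of MCP or any client library; a real client would use the model's own tokenizer.

```python
from dataclasses import dataclass

# Assumed limits for illustration; tune for your model and client.
MAX_TOKENS_PER_TOOL = 256
MAX_TOKENS_PER_SERVER = 2048


def estimate_tokens(text: str) -> int:
    """Rough heuristic (about 4 characters per token), not a real tokenizer."""
    return (len(text) + 3) // 4


@dataclass
class Tool:  # hypothetical stand-in for a registered tool record
    name: str
    description: str


def admit_tools(tools: list[Tool]) -> list[Tool]:
    """Truncate oversized descriptions; reject the server if the total still exceeds the budget."""
    admitted = []
    total = 0
    for tool in tools:
        desc = tool.description
        if estimate_tokens(desc) > MAX_TOKENS_PER_TOOL:
            # Cut to the per-tool budget (in characters, per the heuristic above).
            desc = desc[: MAX_TOKENS_PER_TOOL * 4]
        total += estimate_tokens(desc)
        if total > MAX_TOKENS_PER_SERVER:
            raise ValueError("server exceeds tool description budget; registration rejected")
        admitted.append(Tool(tool.name, desc))
    return admitted
```

Rejecting the whole server once the total is exceeded is one possible policy; dropping only the tools past the limit is another, at the cost of a partially registered server.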
Journey Context:
When an MCP server registers tools, their descriptions are injected into the LLM's context window and consume space there. A malicious server can register many tools with extremely long descriptions so that tool metadata fills most of the window. Depending on how the client trims an overfull context, system prompts, safety instructions, and few-shot examples can be truncated or pushed out, and the LLM then operates without its safety guardrails. The attack is silent: no error, no warning, just degraded behavior. The gotcha is that merely connecting to an MCP server can invisibly compromise agent safety by exhausting the context budget, and the MCP specification imposes no limit on description length. Your agent doesn't fail; it just stops being safe.
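Because the failure is silent, an explicit check before each model call is one way to surface it. The sketch below is illustrative only: `check_context`, the 25% warning threshold, and the character-based token estimate are assumptions, and it presumes the client can inspect the final assembled prompt as a string.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (about 4 characters per token), not a real tokenizer."""
    return (len(text) + 3) // 4


def check_context(
    system_prompt: str,
    tool_descriptions: list[str],
    assembled_prompt: str,
    context_window: int,
    warn_share: float = 0.25,  # assumed threshold
) -> list[str]:
    """Return warnings about tool metadata crowding out the system prompt."""
    warnings = []
    used = sum(estimate_tokens(d) for d in tool_descriptions)
    share = used / context_window
    if share > warn_share:
        warnings.append(f"tool descriptions use {share:.0%} of the context window")
    # Verify the safety instructions survived whatever trimming the client applied.
    if system_prompt not in assembled_prompt:
        warnings.append("system prompt missing from assembled prompt")
    return warnings
```

A substring check only catches outright removal or truncation of the system prompt; it says nothing about whether the model still attends to those instructions when they are surrounded by large volumes of tool text.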
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:10:30.718380+00:00— report_created — created