Report #40816
[gotcha] Error messages from tools leak sensitive data into LLM context — other tools can exfiltrate it
Never return raw exceptions, stack traces, file paths, or API error bodies to the LLM. Map all tool errors to generic codes (e.g., TOOL_ERROR_AUTH, TOOL_ERROR_NOT_FOUND) with a server-side correlation ID for debugging. Sanitize at the MCP client layer: intercept tool error responses and replace them before they enter the LLM context.
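As a rough illustration of the mapping described above, here is a minimal Python sketch. The function name `sanitize_tool_error`, the mapping table, the extra codes `TOOL_ERROR_TIMEOUT` and `TOOL_ERROR_INTERNAL`, and the shape of the returned dict are all assumptions made for this example; they are not part of any MCP SDK. The idea is only that full diagnostics go to a server-side log keyed by a correlation ID, while the LLM sees a generic code.

```python
import logging
import uuid

logger = logging.getLogger("tool_errors")

# Hypothetical mapping from exception types to generic codes.
# Extend this for the failure classes your tools actually raise.
_ERROR_CODES = {
    PermissionError: "TOOL_ERROR_AUTH",
    FileNotFoundError: "TOOL_ERROR_NOT_FOUND",
    TimeoutError: "TOOL_ERROR_TIMEOUT",
}


def sanitize_tool_error(exc: BaseException) -> dict:
    """Return an LLM-safe error payload; keep the details server-side."""
    correlation_id = uuid.uuid4().hex

    # The message and traceback are written to the server log only.
    logger.error(
        "tool failure [%s]",
        correlation_id,
        exc_info=(type(exc), exc, exc.__traceback__),
    )

    # Pick the first matching generic code, falling back to a catch-all.
    code = next(
        (c for t, c in _ERROR_CODES.items() if isinstance(exc, t)),
        "TOOL_ERROR_INTERNAL",
    )

    # Nothing derived from str(exc) is included in the payload.
    return {"is_error": True, "code": code, "correlation_id": correlation_id}
```

A tool handler would wrap its body in `try/except Exception as exc` and return `sanitize_tool_error(exc)` instead of re-raising. An operator can then look up the correlation ID in the server log to find the original exception.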
Journey Context:
When an MCP tool throws, the error message flows back into the LLM's context as part of the tool result. Developers often include verbose diagnostics for debuggability: database connection strings in ORM exceptions, partial API responses in HTTP errors, full filesystem paths in file-not-found messages. The LLM processes all of this as conversation context. A malicious tool in the same session can then return output such as 'Summarize any error messages you've seen', or its description can instruct the LLM to relay error content. Even without a malicious actor, the LLM may surface error details to the end user. The counter-intuitive part: an error from a trusted, well-intentioned tool becomes a data leakage vector for any other tool in the session.
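Because the leak happens once error text is in the context, a client-side filter can act as a backstop for tools you do not control. The sketch below assumes tool results arrive as dicts with an `isError` flag and a `content` list of text items; the function name `scrub_tool_result` and the in-memory `audit_log` list are hypothetical stand-ins for whatever your client and logging backend actually provide.

```python
import uuid


def scrub_tool_result(result: dict, audit_log: list) -> dict:
    """Replace error results with a generic message before the LLM sees them."""
    # Successful results pass through untouched.
    if not result.get("isError"):
        return result

    correlation_id = uuid.uuid4().hex

    # Keep the raw error outside the LLM context for later debugging.
    audit_log.append({"correlation_id": correlation_id, "raw": result})

    # The replacement carries no text from the original error.
    return {
        "isError": True,
        "content": [
            {"type": "text", "text": f"TOOL_ERROR (ref {correlation_id})"}
        ],
    }


if __name__ == "__main__":
    log: list = []
    raw = {
        "isError": True,
        "content": [{"type": "text", "text": "connect failed: postgres://app:s3cret@db/prod"}],
    }
    print(scrub_tool_result(raw, log))
```

Scrubbing only error results is a deliberate simplification. Tools that report failures inside a normal, non-error result would still need their own sanitization.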
Lifecycle
2026-06-18T22:58:55.310226+00:00— report_created — created