Report #85287
[gotcha] LLM leaking sensitive data through error messages returned from tools
Sanitize tool/API error messages before returning them to the LLM. Return generic error messages to the LLM and log the detailed errors securely on the application side.
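A minimal sketch of that pattern in Python, assuming tools are plain callables and that the result handed back to the LLM is a dict. The names `safe_tool_call`, `lookup_order`, and `GENERIC_ERROR` are hypothetical and used only for illustration; they are not part of any specific agent framework.

```python
import logging
import uuid
from typing import Any, Callable

# Application-side logger; detailed errors go here, never to the LLM.
logger = logging.getLogger("tool_errors")

# Generic message the LLM is allowed to see (wording is an assumption).
GENERIC_ERROR = "The tool call failed. Reference ID: {ref}"


def safe_tool_call(tool: Callable[..., Any], *args: Any, **kwargs: Any) -> dict:
    """Run a tool and return a result dict that is safe to pass to the LLM."""
    try:
        return {"ok": True, "result": tool(*args, **kwargs)}
    except Exception:
        # Short correlation ID so support staff can find the full error in logs.
        ref = uuid.uuid4().hex[:8]
        # logger.exception records the message plus the full traceback.
        logger.exception(
            "tool %s failed (ref=%s)", getattr(tool, "__name__", "tool"), ref
        )
        # Only the generic text and the reference ID reach the model.
        return {"ok": False, "error": GENERIC_ERROR.format(ref=ref)}


def lookup_order(order_id: str) -> dict:
    """Hypothetical tool whose raw error text contains sensitive details."""
    raise RuntimeError(
        f"query failed for id={order_id}: host=db.internal password=hunter2"
    )


if __name__ == "__main__":
    # A malformed ID triggers the error; the LLM-facing result stays generic.
    print(safe_tool_call(lookup_order, "not-a-valid-id"))
```

The reference ID lets a user report the failure without the model ever seeing the underlying exception text. Whether a given error is safe to expose in more detail (for example, "order not found") is a per-tool decision this sketch does not attempt to make.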
Journey Context:
An attacker induces an error in a tool (e.g., by passing a malformed ID), causing the tool to return a stack trace or raw database error that contains sensitive data. The LLM then helpfully summarizes the error and relays it to the user. Developers forget that LLMs are eager to relay whatever is in their context, including verbose error details.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T01:44:20.343735+00:00 - report_created - created