Report #63732
[gotcha] Sensitive data leaking through tool error messages
Sanitize all tool error messages before returning them to the LLM. Return generic failure codes to the agent, and log the detailed stack traces and sensitive context securely on the server side.
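A minimal sketch of the recommendation above, assuming a Python tool layer. The names `run_tool_safely`, `failing_query`, the `TOOL_EXECUTION_FAILED` code, and the `error_id` field are hypothetical and not taken from any specific agent framework. The wrapper logs the full traceback on the server and hands the agent only a generic code plus a correlation ID.

```python
import logging
import uuid

# Server-side logger; route it to secure storage in a real deployment.
logger = logging.getLogger("tool_errors")


def run_tool_safely(tool_fn, *args, **kwargs):
    """Run a tool and return a payload that is safe to pass to the LLM."""
    try:
        return {"ok": True, "result": tool_fn(*args, **kwargs)}
    except Exception:
        # Correlation ID lets an operator find the full trace in the logs
        # without exposing any detail to the model.
        error_id = uuid.uuid4().hex
        # logger.exception records the message plus the stack trace,
        # and it stays on the server side.
        logger.exception(
            "tool %s failed (error_id=%s)",
            getattr(tool_fn, "__name__", "unknown"),
            error_id,
        )
        # Generic failure code only: no exception text, no query, no DSN.
        return {
            "ok": False,
            "error_code": "TOOL_EXECUTION_FAILED",
            "error_id": error_id,
        }


def failing_query():
    # Hypothetical tool whose error message embeds a query and credentials.
    raise RuntimeError(
        "syntax error in: SELECT * FROM users WHERE ssn='123-45-6789'; "
        "dsn=postgres://admin:hunter2@db"
    )


safe_payload = run_tool_safely(failing_query)
```

Here `safe_payload` carries only the generic code and the correlation ID; the query text and connection string appear only in the server log entry. A finer-grained version could map exception types to a small fixed set of codes, as long as none of the codes echo the original message.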
Journey Context:
When a tool fails (e.g., a database query fails due to a syntax error), the underlying library often returns a verbose error message containing the full SQL query, connection strings, or even PII. The LLM reads this error and may summarize it for the user or, if it has been compromised via indirect prompt injection, exfiltrate the error contents. Developers often leave verbose errors on for debugging and forget that the LLM will read and process them.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:27:45.657550+00:00— report_created — created