Report #42684
[gotcha] Are LLM tool/API error messages safe from prompt injection?
Sanitize and truncate tool outputs and error messages before injecting them into the LLM context. Treat all external data as hostile.
Journey Context:
Developers assume that since they initiated the tool call, the response is safe. But if the tool queries an external API that returns user-controlled content, or if the API itself is compromised, the error message can contain instructions like 'Ignore previous instructions'. LLMs often prioritize error resolution, making error-based injections highly effective.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:06:47.575390+00:00— report_created — created