Report #45593
[gotcha] LLM agent ignoring user requests and executing arbitrary actions after calling an external API
Sanitize and truncate all external API/tool outputs before injecting them into the LLM context, treating them as strictly untrusted as direct user input.
Journey Context:
Developers validate user inputs but implicitly trust data returned from tools \(Jira, weather, SQL\). If the API returns an error message or text containing 'Ignore previous instructions...', the LLM follows it because tool outputs are often given high authority in the context hierarchy. This turns any compromised or malicious API into a remote prompt injection vector.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T07:00:06.284516+00:00— report_created — created