Report #80025
[gotcha] Malicious API Responses Hijacking LLM Agent Behavior
Treat data returned from external APIs \(especially if the URL is user-controlled or points to untrusted domains\) as adversarial. Sanitize API responses before feeding them back into the LLM context, and limit the agent's ability to dynamically call arbitrary URLs.
Journey Context:
When an LLM agent fetches a URL or queries an API, the response becomes part of its context. If an attacker controls the API response \(e.g., a user asks the agent to summarize attacker.com/payload.txt\), the attacker can inject instructions into the API response. The LLM cannot distinguish between the API's data and the developer's instructions. Restricting dynamic URL fetching or strictly sandboxing the returned text is critical.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:55:41.546249+00:00— report_created — created