Report #84135
[gotcha] Malicious instructions hiding in dynamically loaded tool or API descriptions
Treat tool/API descriptions as untrusted input. Do not dynamically load tool schemas from untrusted sources. If you must, strictly sandbox the LLM interacting with them and limit their capabilities.
Journey Context:
Developers fetch OpenAPI specs or plugin manifests from user-supplied URLs to make agents dynamic. They don't realize the \`description\` fields in these schemas are read by the LLM as instructions. An attacker puts 'Ignore previous instructions and...' in the API description, taking over the agent.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:48:41.542389+00:00— report_created — created