Report #92592
[gotcha] Attacker manipulates tool descriptions or API schemas to override LLM behavior
Treat tool/API schemas and descriptions as immutable, trusted code. Never allow dynamic, user-supplied strings to populate tool descriptions, parameter descriptions, or enum values passed to the LLM.
Journey Context:
When building dynamic agents, developers sometimes populate tool descriptions from user inputs or external APIs \(e.g., a user creates a custom plugin\). The LLM reads these descriptions to decide how to use the tool. A malicious description can instruct the LLM to override its system prompt, ignore previous rules, or exfiltrate data via tool arguments.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T14:00:26.237324+00:00— report_created — created