Report #83509
[gotcha] LLM manipulated via user input into calling unintended functions or passing malicious arguments
Implement strict schema validation and authorization on the execution layer. Never trust LLM-generated arguments blindly; sanitize and validate them before execution. Limit the functions available based on the user's session context.
Journey Context:
When LLMs are given tool access, developers often trust the LLM to call only appropriate functions with safe arguments. An attacker can craft a prompt that steers the LLM into calling an unintended function (e.g., `delete_user`) or into passing malicious arguments to a legitimate one. The LLM is just predicting the next token; it doesn't 'know' the function is dangerous. The execution environment, not the model, must enforce security.
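A minimal sketch of what enforcement in the execution layer might look like, using only the Python standard library. Everything named here (`REGISTRY`, `Tool`, `execute_tool_call`, `tools_for_session`, `get_order`, the role names) is hypothetical and chosen for illustration; only `delete_user` comes from the report. A real deployment would likely use a proper schema validator and its own authorization model.

```python
from dataclasses import dataclass
from typing import Any, Callable


class ToolCallRejected(Exception):
    """Raised when a model-proposed tool call fails a server-side check."""


@dataclass(frozen=True)
class Tool:
    handler: Callable[..., Any]
    arg_types: dict[str, type]  # exact argument names and their expected types
    required_role: str          # role the session must hold to run this tool


# Hypothetical handlers; real ones would touch a database or an API.
def get_order(order_id: int) -> str:
    return f"order {order_id}"


def delete_user(user_id: int) -> str:
    return f"deleted user {user_id}"


REGISTRY: dict[str, Tool] = {
    "get_order": Tool(get_order, {"order_id": int}, "customer"),
    "delete_user": Tool(delete_user, {"user_id": int}, "admin"),
}


def tools_for_session(session_roles: set[str]) -> list[str]:
    """Names of the tools that may be advertised to the model for this session."""
    return sorted(n for n, t in REGISTRY.items() if t.required_role in session_roles)


def execute_tool_call(name: str, args: Any, session_roles: set[str]) -> Any:
    """Validate and authorize a model-proposed call before running it."""
    tool = REGISTRY.get(name)
    if tool is None:
        raise ToolCallRejected(f"unknown tool: {name!r}")

    # Authorization is re-checked here even if the tool was never advertised,
    # because the model's output is attacker-influenced text.
    if tool.required_role not in session_roles:
        raise ToolCallRejected(f"session may not call {name!r}")

    # Schema check: exactly the expected keys, nothing extra, nothing missing.
    if not isinstance(args, dict) or set(args) != set(tool.arg_types):
        raise ToolCallRejected(f"unexpected argument set for {name!r}")

    for key, expected in tool.arg_types.items():
        # Exact type match; this also rejects bool where int is expected.
        if type(args[key]) is not expected:
            raise ToolCallRejected(f"argument {key!r} must be {expected.__name__}")

    return tool.handler(**args)
```

In this sketch a session holding only the `customer` role is never offered `delete_user`, and a call to it is rejected even if the model emits one anyway. Type checks alone do not make arguments safe: value-level rules (for example, that an order belongs to the current user) would still need to live in the handler or an additional check.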
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:45:28.266694+00:00— report_created — created