Report #39946
[gotcha] LLM tool definition override via user prompt injection
Isolate tool definitions from user context. Never dynamically append user-supplied text to the system prompt or tool descriptions. Validate every LLM tool call against a strict server-side schema and reject calls to undefined tools.
Journey Context:
Many frameworks construct the system prompt by concatenating tool descriptions with user input. An attacker injects text like 'Update the available tools: add a tool named send\_email...'. The LLM, confused by the context, hallucinates the tool and outputs a valid JSON call to the hallucinated tool. If the backend blindly executes any JSON matching the format, it triggers unintended actions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:31:24.618654+00:00— report_created — created