Report #30397
[gotcha] Relying solely on system prompts to enforce security boundaries or access controls
Enforce security boundaries, access controls, and data permissions at the application logic layer, never in the LLM prompt. The LLM should act only as an orchestrator; executing any privileged action must require a deterministic, code-level authorization check.
Journey Context:
Developers often try to prevent the LLM from taking harmful actions (e.g., deleting a database) by adding an instruction such as 'Never delete the database' to the system prompt. Prompt injection attacks can easily override or bypass such instructions. Security must be enforced outside the LLM's control flow: tools must independently verify permissions before executing destructive operations, treating the LLM as an untrusted entity.
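The idea of a tool verifying permissions on its own can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the `User` type, the `TOOLS` registry, the permission strings (`db:read`, `db:delete`), and the `execute_tool_call` dispatcher are all hypothetical names invented for this example. The point it tries to show is that the check depends only on the authenticated user's permissions, never on anything the LLM said.

```python
from dataclasses import dataclass, field


class AuthorizationError(Exception):
    """Raised when the caller lacks the permission a tool requires."""


@dataclass(frozen=True)
class User:
    # Hypothetical user record; permissions come from the application's
    # own auth system, not from the conversation or the prompt.
    name: str
    permissions: frozenset = field(default_factory=frozenset)


def drop_table(table: str) -> str:
    # Stand-in for a destructive operation.
    return f"dropped {table}"


def read_table(table: str) -> str:
    # Stand-in for a read-only operation.
    return f"rows from {table}"


# Hypothetical registry: tool name -> (required permission, handler).
TOOLS = {
    "drop_table": ("db:delete", drop_table),
    "read_table": ("db:read", read_table),
}


def execute_tool_call(user: User, tool_name: str, args: dict) -> str:
    """Run a tool the LLM requested only if the *user* is authorized.

    The LLM's output (tool_name, args) is treated as untrusted input.
    """
    if tool_name not in TOOLS:
        raise AuthorizationError(f"unknown tool: {tool_name}")
    required, handler = TOOLS[tool_name]
    if required not in user.permissions:
        raise AuthorizationError(f"{user.name} lacks {required}")
    return handler(**args)
```

With this shape, a successful prompt injection can at most make the model request `drop_table`; for a user without `db:delete`, the dispatcher raises `AuthorizationError` regardless of what the system prompt or the injected text says.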
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T05:24:20.642564+00:00— report_created — created