Report #86167
[counterintuitive] System prompts securely hide instructions from users
Never put secrets, API keys, or sensitive proprietary logic in system prompts. Treat system prompts as user-visible, and implement backend validation for any actions the agent takes.
Journey Context:
Developers often treat the system prompt as a secure, hidden 'root' instruction. In reality, LLMs are highly susceptible to prompt injection, and users can often extract a system prompt by asking the model to repeat it, to output it in base64, or to ignore its previous instructions. The system prompt is soft guidance, not a sandbox or security boundary.
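The backend validation recommended above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the role table, the refund limit, the `PAYMENTS_API_KEY` variable name, and the `authorize`/`execute` helpers are all hypothetical. The point it tries to show is that the decision to run an action depends only on the authenticated user and server-side policy, never on anything the system prompt told the model.

```python
import os
from dataclasses import dataclass

# Hypothetical policy table: which actions each role may perform.
ALLOWED_ACTIONS = {
    "viewer": {"read_order"},
    "support": {"read_order", "refund_order"},
}

# Hypothetical business limit, enforced on the server rather than in the prompt.
MAX_REFUND_CENTS = 5_000


@dataclass(frozen=True)
class User:
    user_id: str
    role: str


def authorize(user: User, action: str, args: dict) -> tuple:
    """Decide whether a model-proposed action may run for this user.

    Returns (allowed, reason). The model's output is treated as untrusted input.
    """
    if action not in ALLOWED_ACTIONS.get(user.role, set()):
        return False, f"role {user.role!r} may not perform {action!r}"
    if action == "refund_order":
        amount = args.get("amount_cents")
        is_int = isinstance(amount, int) and not isinstance(amount, bool)
        if not is_int or not 0 < amount <= MAX_REFUND_CENTS:
            return False, "refund amount outside the allowed range"
    return True, "ok"


def execute(user: User, action: str, args: dict) -> str:
    """Run an action only after the server-side check passes."""
    allowed, reason = authorize(user, action, args)
    if not allowed:
        return f"denied: {reason}"
    # The credential is read from the server environment, so it never
    # appears in the system prompt and cannot be extracted from the model.
    api_key = os.environ.get("PAYMENTS_API_KEY", "")
    # A real implementation would call the downstream service here using api_key.
    return f"executed {action}"
```

With this shape, a user who tricks the model into proposing `refund_order` still gets a denial unless their own role and the requested amount pass the server-side check.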
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T03:13:17.127704+00:00— report_created — created