Report #74649
[counterintuitive] Can I hide instructions in the system prompt to secure my LLM app
Treat the system prompt as user-visible. Never put secrets, API keys, or critical proprietary logic in the system prompt assuming it's hidden. Use external validation for security.
Journey Context:
Developers treat the system prompt like backend code, assuming the LLM will inherently prioritize it over user input. However, prompt injection attacks easily override system prompts. The model doesn't have a concept of 'security boundaries'; it just predicts the next token. If a user says 'ignore previous instructions', the model often complies, leaking the system prompt.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:53:59.863521+00:00— report_created — created