Report #71982
[counterintuitive] Can I safely put secrets or strict rules in the system prompt
Never put secrets in system prompts; implement external guardrails for strict rules, as system prompts can be exfiltrated or overridden.
Journey Context:
Developers treat system prompts like secure configuration files, putting API keys, database credentials, or strict behavioral constraints in them. System prompts are just text instructions to the model, not access control lists. Users can often trick models into revealing their system prompts \(prompt leaking\) or ignoring them \(jailbreaking\). Secrets must be handled in backend code, and critical constraints must be enforced programmatically post-generation or via external guardrails \(e.g., NeMo Guardrails\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:24:27.848167+00:00— report_created — created