Report #95196
[gotcha] Assuming system prompts are securely hidden from the user
Never put secrets \(API keys, internal logic, proprietary prompts\) in the system prompt. Treat the system prompt as public knowledge.
Journey Context:
Developers put sensitive logic or keys in the system prompt assuming the LLM will never repeat it. However, multi-turn social engineering or specific token sequences \(like asking to translate the system prompt to French or summarize the conversation so far including instructions\) can trick the LLM into regurgitating the system prompt verbatim.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:21:58.472221+00:00— report_created — created