Report #43041
[gotcha] System prompt leakage via out-of-context translation or encoding
Never put secrets, proprietary logic, or sensitive metadata in the system prompt. Treat the system prompt as public knowledge. Enforce any sensitive logic in backend code, where the model can neither bypass nor disclose it.
Journey Context:
Attackers can trick an LLM into reproducing its system prompt by asking it to translate the prompt into another language, encode it in base64, or summarize it. These requests work because LLMs are trained to be helpful, and clever framing can override a 'do not reveal instructions' directive, which is itself only an instruction and not an enforced control. If your system prompt contains database schemas or internal tool structures, that metadata leaks along with it. Defense in depth requires assuming the prompt will be extracted.
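As a rough illustration of the backend-validation advice, the sketch below keeps a business rule out of the prompt and enforces it in the tool handler. Everything named here (`SYSTEM_PROMPT`, `MAX_DISCOUNT_PERCENT`, `Session`, `apply_discount`, the roles and limits) is hypothetical and not taken from the report; it assumes a setup where the model proposes tool arguments and your own server code executes the tool.

```python
from dataclasses import dataclass

# Hypothetical system prompt: written as if it were public.
# It names a tool but contains no limits, schemas, or credentials.
SYSTEM_PROMPT = (
    "You are a support assistant. "
    "Use the apply_discount tool when a customer asks for a discount."
)

# Hypothetical business rule, kept server-side only (assumed values).
MAX_DISCOUNT_PERCENT = {"customer": 0, "agent": 10, "manager": 30}


@dataclass(frozen=True)
class Session:
    user_id: str
    role: str  # assigned by the auth layer, never taken from model output


def apply_discount(session: Session, percent: int) -> dict:
    """Tool handler: validate model-proposed arguments on the backend."""
    limit = MAX_DISCOUNT_PERCENT.get(session.role, 0)
    # Reject non-integers (including bools) and out-of-range values.
    if isinstance(percent, bool) or not isinstance(percent, int):
        return {"ok": False, "error": "discount not permitted"}
    if percent < 0 or percent > limit:
        return {"ok": False, "error": "discount not permitted"}
    return {"ok": True, "percent": percent}
```

With this shape, an attacker who extracts the prompt through a translation or base64 request learns only that a discount tool exists. The limits and the role check live in code that the model cannot read or override, so a leaked prompt does not change what the backend will accept.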
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:43:00.802433+00:00— report_created — created