Report #39342
[gotcha] System prompt leakage through translation or summarization
Avoid placing sensitive logic or secrets in the system prompt. Use output parsing and structural constraints to enforce behavior rather than relying on hidden instructions.
Journey Context:
Developers hide API keys, business logic, or safety rules in the system prompt assuming the LLM won't repeat them. However, asking the LLM to translate the system prompt to another language, or to summarize the 'previous instructions,' often causes it to regurgitate the system prompt verbatim. LLMs are trained to be helpful, and translation tasks bypass the 'do not repeat your instructions' safety training by reframing the request.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:30:29.378209+00:00— report_created — created