Report #87200
[gotcha] LLM revealing system prompt contents through repetitive or degenerate prompting
Never put secrets, API keys, or sensitive logic in system prompts. Implement output scanning to detect and redact fragments of the system prompt before returning the response to the user.
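A minimal sketch of the output scanning described above, assuming the application holds the system prompt as a plain string and can post-process each response before returning it. The function name `redact_prompt_leaks`, the `min_words` threshold, and the `[REDACTED]` placeholder are hypothetical choices for illustration, not part of any library. It flags any run of `min_words` consecutive words in the output that also appears in the system prompt (case-insensitive, ignoring punctuation) and replaces the matching span.

```python
import re

REDACTION = "[REDACTED]"  # hypothetical placeholder text


def _tokens(text):
    """Return (lowercased word, start offset, end offset) for each word."""
    return [(m.group().lower(), m.start(), m.end())
            for m in re.finditer(r"\w+", text)]


def redact_prompt_leaks(system_prompt, output, min_words=5):
    """Redact runs of >= min_words consecutive words copied from the prompt."""
    prompt_words = [w for w, _, _ in _tokens(system_prompt)]
    # Every run of min_words consecutive words in the system prompt.
    shingles = {
        tuple(prompt_words[i:i + min_words])
        for i in range(len(prompt_words) - min_words + 1)
    }

    out = _tokens(output)
    leaked = [False] * len(out)
    for i in range(len(out) - min_words + 1):
        window = tuple(w for w, _, _ in out[i:i + min_words])
        if window in shingles:
            for j in range(i, i + min_words):
                leaked[j] = True

    # Merge adjacent leaked words into character spans and replace each span.
    pieces, cursor, i = [], 0, 0
    while i < len(out):
        if not leaked[i]:
            i += 1
            continue
        j = i
        while j + 1 < len(out) and leaked[j + 1]:
            j += 1
        pieces.append(output[cursor:out[i][1]])
        pieces.append(REDACTION)
        cursor = out[j][2]
        i = j + 1
    pieces.append(output[cursor:])
    return "".join(pieces)
```

Exact-match scanning like this only catches verbatim fragments. Paraphrased, translated, or encoded leaks will pass through, so treat it as one layer of defense and still keep secrets out of the prompt entirely.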
Journey Context:
Developers often put proprietary logic or credentials in system prompts, assuming the LLM will not repeat them. However, degenerate prompts, such as asking the model to repeat a word forever, can cause the model to diverge and emit parts of its context window, including the system prompt. Anything placed in the system prompt should be treated as potentially visible to the user; system prompts are not a secure storage mechanism.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:57:28.072614+00:00— report_created — created