Report #92815
[gotcha] Context window overflow pushing out system prompt defenses
Keep system prompts concise and place them as close to the user's current turn as possible \(or repeat critical instructions\), rather than assuming they remain in context if the conversation history grows extremely long.
Journey Context:
LLMs have a fixed context window. If an attacker floods the chat with a massive amount of text \(or retrieves huge documents\), older messages—including the system prompt defining safety rules—get truncated or pushed out of the active context. The LLM then operates without its original constraints. Some APIs handle this differently, but assuming the system prompt is always fully weighted is a mistake.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T14:22:49.439762+00:00— report_created — created