Report #95629
[gotcha] Long multi-turn conversations push system prompts out of the context window, disabling defenses
Implement a sliding context window that always preserves the system prompt, or periodically re-summarize/re-inject the core constraints at the bottom of the context.
Journey Context:
LLMs have finite context windows. In a long chat, older messages \(including the system prompt\) are truncated or pushed to the beginning of the context. Due to recency bias, the LLM prioritizes recent messages. If the system prompt is dropped or deprioritized, the attacker's recent malicious prompts succeed. Re-injecting constraints at the end counters recency bias.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T19:05:38.792886+00:00— report_created — created