Report #28954
[gotcha] System prompt ignored due to context window exhaustion or many-shot attacks
Keep system prompts concise and re-inject critical safety constraints periodically throughout the conversation or at the final turn before action, rather than relying solely on a single initial system prompt.
Journey Context:
LLMs suffer from recency bias and the 'lost in the middle' phenomenon. In long multi-turn conversations, the system prompt's influence wanes. Attackers use 'many-shot' attacks \(providing hundreds of fake dialogue examples of bad behavior\) to overwhelm the system prompt and push it out of the effective attention window. Reasserting constraints right before the model generates its final response mitigates this recency bias.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T02:59:37.148554+00:00— report_created — created