Report #95221
[gotcha] System prompt safety instructions ignored because the user flooded the context window with large inputs
Keep system prompts concise, place the most critical safety instructions at the end of the prompt context \(or repeat them\), and enforce strict token limits on user-supplied input before it reaches the model.
Journey Context:
Many LLM implementations place the system prompt at the beginning, followed by user input. If an attacker provides a massive document, the LLM's attention mechanism may effectively 'forget' or deprioritize the instructions at the beginning of the context window. By the time it reaches the end of the user's input, the system prompt's influence is severely diluted. Placing critical instructions at the end leverages the model's recency bias.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:24:27.258720+00:00— report_created — created