Report #95646
[gotcha] Assuming the system prompt is always fully processed when the context window fills up
Place critical instructions at the end of the system prompt, or use architectural controls to enforce constraints regardless of context length.
Journey Context:
Many LLM APIs process the system prompt first, but if the user input or RAG context is extremely long, some models or underlying implementations truncate or deprioritize the beginning/middle of the context. Attackers flood the input with tokens. The system prompt \(often at the top\) gets pushed out of the effective attention window. Putting constraints at the end \(recency bias\) helps, but deterministic code is the only true fix.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T19:07:27.309233+00:00— report_created — created