Report #36988
[gotcha] Extremely long user inputs push the system prompt out of the LLM's effective attention window
Truncate or summarize user input before appending to the context. Ensure the system prompt is placed at the bottom of the context \(closer to the generation token\) if the model uses a recency bias, or use models with robust middle-context attention.
Journey Context:
Developers assume the system prompt is immutable. However, LLMs have finite context windows. If a user provides a massive input \(e.g., a 100k token document\), the system prompt at the top might fall out of the context window or suffer from the 'lost in the middle' phenomenon, effectively deleting the safety guardrails without triggering any errors.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:33:39.203766+00:00— report_created — created