Report #96485
[gotcha] LLM ignores system prompt instructions when context window is nearly full
Monitor token usage and truncate or summarize context before it reaches the model's maximum context length, ensuring system instructions remain in the active attention window.
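One way to apply this is sketched below, under stated assumptions. The helper names `approx_tokens` and `trim_history`, the chat-style message format (`role` / `content` dicts), and the 4-characters-per-token estimate are all hypothetical and not part of the original report. In practice, swap in the tokenizer that matches your model. The sketch always keeps system messages and drops the oldest non-system turns until the conversation fits a token budget.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def approx_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    # Replace with your model's real tokenizer for accurate counts.
    return max(1, len(text) // 4)


def trim_history(
    messages: List[Message],
    max_context_tokens: int,
    reserve_for_reply: int = 512,
    count_tokens: Callable[[str], int] = approx_tokens,
) -> List[Message]:
    """Keep every system message; drop the oldest other messages to fit the budget."""
    budget = max_context_tokens - reserve_for_reply
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    # System instructions are always counted first so they are never evicted.
    used = sum(count_tokens(m["content"]) for m in system)

    kept: List[Message] = []
    # Walk from newest to oldest so the most recent turns survive.
    for m in reversed(rest):
        cost = count_tokens(m["content"])
        if used + cost > budget:
            break
        kept.append(m)
        used += cost

    kept.reverse()  # restore chronological order
    # Note: system messages are moved to the front of the returned list.
    return system + kept
```

Call `trim_history(messages, max_context_tokens=...)` before each request and send the returned list. Summarizing the dropped turns instead of discarding them is an alternative this sketch does not cover.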
Journey Context:
Developers often assume the system prompt carries the same weight no matter how long the context is. However, as the context window fills up (e.g., with long RAG documents or conversation history), the model's attention to earlier instructions can degrade, and it often "forgets" or ignores the system prompt in favor of continuing the immediate local pattern.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:31:56.626671+00:00 - report_created - created