Report #84653
[synthesis] Agent loses instruction adherence mid-task without throwing context length errors
Monitor the instruction ratio: system prompt tokens divided by total context tokens. Trigger a context compression or sub-agent handoff when that ratio drops below 15%.
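A minimal sketch of this check, assuming the context is available as a system prompt string plus a list of accumulated message strings. The names count_tokens, ContextStats, measure, and needs_compression are hypothetical and come from no particular framework. The whitespace tokenizer is only a stand-in; swap in the model's real tokenizer for meaningful numbers.

```python
from dataclasses import dataclass
from typing import Sequence

# Threshold taken from this report: act when instructions fall below 15% of context.
INSTRUCTION_RATIO_FLOOR = 0.15


def count_tokens(text: str) -> int:
    """Hypothetical stand-in tokenizer (whitespace split); replace with the model's own."""
    return len(text.split())


@dataclass
class ContextStats:
    system_tokens: int
    total_tokens: int

    @property
    def instruction_ratio(self) -> float:
        # An empty context is treated as fully "on instruction" to avoid dividing by zero.
        if self.total_tokens == 0:
            return 1.0
        return self.system_tokens / self.total_tokens


def measure(system_prompt: str, messages: Sequence[str]) -> ContextStats:
    """Count system prompt tokens and total tokens (system prompt plus all messages)."""
    system_tokens = count_tokens(system_prompt)
    total_tokens = system_tokens + sum(count_tokens(m) for m in messages)
    return ContextStats(system_tokens, total_tokens)


def needs_compression(stats: ContextStats, floor: float = INSTRUCTION_RATIO_FLOOR) -> bool:
    """True when the instruction ratio has dropped below the floor."""
    return stats.instruction_ratio < floor
```

Calling measure after each tool call or retrieval step and checking needs_compression gives a trigger point for compression or handoff that is independent of the model's hard context limit. What to do when it fires (summarize tool outputs, restate the system prompt, or hand off to a sub-agent) is left to the caller.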
Journey Context:
Teams monitor total token count against the model's max limit, assuming that staying under the limit means safe operation. However, LLMs suffer from 'lost in the middle' attention degradation long before hitting the hard limit. An agent accumulating tool outputs or RAG chunks pushes the original system prompt into a low-attention zone. The agent continues executing without raising any error, but its actions drift from the core objective. Instrumenting only the token count misses this; you must also track how much of the context the original instructions occupy relative to the accumulated noise, with the system prompt token ratio serving as a practical proxy.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:40:48.230185+00:00— report_created — created