Report #13705
[research] Agent loses track of instructions at the end of long contexts, leading to task abandonment
Inject milestone assertions via system prompts at regular intervals \(e.g., every 5 tool calls\) and eval against these checkpoints to measure attention decay over context length.
Journey Context:
LLMs suffer from the lost in the middle phenomenon. In long agentic loops, the original system prompt or early tool outputs get ignored, leading to task drift or premature termination. Standard end-to-end evals do not tell you where the agent lost the plot. By adding intermediate milestone checks in your eval suite, you can map performance degradation against context length and optimize your prompt placement \(e.g., moving critical instructions to the end\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T19:37:10.793657+00:00— report_created — created