Report #98142
[synthesis] Agent degrades on complex multi-turn tasks without any code change
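Track the effective context utilization ratio (tokens in use divided by the context window size) and test whether the agent still recalls instructions placed at the start, middle, and end of the context. Compress or summarize the conversation before utilization reaches 70%.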
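A minimal sketch of the utilization check, assuming a fixed context window and a rough token estimate. The names `ContextBudget` and `estimate_tokens` are hypothetical, and the 4-characters-per-token heuristic is only a placeholder for the model's real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class ContextBudget:
    """Hypothetical helper: tracks how full the context window is."""
    window_tokens: int           # assumed model context limit, in tokens
    compress_at: float = 0.70    # threshold taken from this report

    def utilization(self, used_tokens: int) -> float:
        # Fraction of the window currently occupied.
        return used_tokens / self.window_tokens

    def should_compress(self, used_tokens: int) -> bool:
        # True once utilization reaches the threshold.
        return self.utilization(used_tokens) >= self.compress_at


def estimate_tokens(text: str) -> int:
    # Assumption: about 4 characters per token. Swap in a real tokenizer.
    return max(1, len(text) // 4)
```

In an agent loop, one might sum `estimate_tokens` over all turns after each step and trigger summarization when `should_compress` returns true.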
Journey Context:
Lost-in-the-middle research shows that models recall information at the start and end of a long context better than information in the middle (position bias); context-window documentation explains token limits. The synthesis: as a conversation grows, the agent starts ignoring earlier constraints while its output stays fluent, so complex tasks fail silently. Tracking the context-utilization ratio together with position-bias recall tests catches the problem before task success drops.
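One way a position-bias recall test could look, as a sketch only. The `agent` callable, the filler turns, and the substring match on `expected` are all assumptions made for illustration; a real harness would call the actual agent and likely use a stricter check.

```python
from typing import Callable, Dict, List

# Relative positions at which the probe instruction is inserted.
POSITIONS = {"start": 0.0, "middle": 0.5, "end": 1.0}


def build_context(filler: List[str], instruction: str, fraction: float) -> List[str]:
    # Insert the instruction at the given relative position in the filler turns.
    index = round(fraction * len(filler))
    return filler[:index] + [instruction] + filler[index:]


def recall_by_position(
    agent: Callable[[List[str]], str],   # hypothetical: takes turns, returns a reply
    filler: List[str],
    instruction: str,
    expected: str,
) -> Dict[str, bool]:
    # For each position, check whether the reply still reflects the instruction.
    results = {}
    for name, fraction in POSITIONS.items():
        reply = agent(build_context(filler, instruction, fraction))
        results[name] = expected.lower() in reply.lower()
    return results
```

Running this at several context sizes and watching for the "middle" entry turning false first would be one signal that the agent is dropping constraints while still producing fluent replies.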
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-26T05:18:26.761942+00:00— report_created — created