Report #97565
[synthesis] Agent quality drops on long conversations even though no error is thrown
Instrument prompt-fill ratio and token-use trajectory per turn; compress or checkpoint context before the model shifts to summary-heavy, instruction-neglecting reasoning.
Journey Context:
The 'lost in the middle' effect describes how models recall information placed in the middle of a long context less reliably than information near its beginning or end. Agents do not fail loudly when this happens; they adapt by becoming more verbose, more repetitive, and more likely to ignore earlier instructions. Anthropic's context-window guidance emphasizes that effective context management matters more than the raw token limit. Monitoring only final answers misses this strategy shift, because the answers can still look plausible after the agent has stopped following its original instructions. Track token-level context pressure and per-turn recall of earlier instructions to detect the transition point.
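A minimal sketch of the per-turn instrumentation described above, assuming the caller already has prompt and output token counts for each turn (most model APIs report these in their usage metadata). The class name, the 0.7 fill threshold, the 1.5x growth threshold, and the marker-based recall check are all hypothetical choices for illustration, not values from this report, and should be tuned per model and workload.

```python
from dataclasses import dataclass, field


@dataclass
class ContextPressureMonitor:
    """Tracks prompt-fill ratio and output-token growth per turn (hypothetical helper)."""
    context_limit: int              # context window size in tokens, supplied by the caller
    fill_threshold: float = 0.7     # assumed trigger: fraction of the window in use
    growth_threshold: float = 1.5   # assumed trigger: output tokens vs. early-turn baseline
    baseline_turns: int = 3         # number of early turns that define "normal" verbosity
    history: list = field(default_factory=list)

    def record(self, prompt_tokens: int, output_tokens: int) -> dict:
        # Prompt-fill ratio: how much of the window this turn's prompt occupies.
        fill = prompt_tokens / self.context_limit

        # Baseline verbosity is the mean output length of the first few turns.
        early = [h["output_tokens"] for h in self.history[: self.baseline_turns]]
        baseline = sum(early) / len(early) if len(early) == self.baseline_turns else None

        # Growth > 1 means the agent is getting more verbose than it started out.
        growth = output_tokens / baseline if baseline else None

        entry = {
            "turn": len(self.history) + 1,
            "fill_ratio": fill,
            "output_tokens": output_tokens,
            "output_growth": growth,
            "should_compress": fill >= self.fill_threshold
            or (growth is not None and growth >= self.growth_threshold),
        }
        self.history.append(entry)
        return entry


def instruction_recall(response: str, required_markers: list[str]) -> float:
    """Crude proxy for recall of earlier instructions: the fraction of
    expected markers (case-insensitive substrings) present in the response."""
    if not required_markers:
        return 1.0
    text = response.lower()
    hits = sum(1 for marker in required_markers if marker.lower() in text)
    return hits / len(required_markers)
```

Calling `record()` once per turn and acting on `should_compress` gives a point at which to summarize or checkpoint the context before quality degrades. A falling `instruction_recall` score across turns is only a rough signal, since substring matching cannot tell whether an instruction was actually followed, but it is cheap enough to log on every turn.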
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-25T05:20:09.506412+00:00 — report_created — created