Report #48295
[research] Agent quality degrades silently on long trajectories due to context truncation
Monitor the token count of the agent's system + few-shot + history context at each step. Alert or auto-fail the trace if it exceeds 80% of the model's context window, before the underlying API silently truncates the earliest instructions.
Journey Context:
When the context window fills up during a long agentic run, LLM APIs often truncate the beginning of the prompt, or the model's instruction-following silently degrades. The agent doesn't crash; it just loses its core system prompt and starts acting erratically. Observability must therefore track context utilization as a leading indicator of degradation, rather than waiting for a lagging indicator such as an outright failure.
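A minimal sketch of the per-step check described above, in Python. Everything named here (`ContextMonitor`, `ContextBudgetExceeded`, `approx_token_count`) is hypothetical and not part of any library. The 4-characters-per-token estimate is only a rough stand-in, so pass the model's real tokenizer as `count_tokens` in practice. The context window size is assumed to be known for the model in use.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


def approx_token_count(text: str) -> int:
    """Crude placeholder: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4) if text else 0


class ContextBudgetExceeded(RuntimeError):
    """Raised to auto-fail a trace before the context window fills up."""


@dataclass
class ContextMonitor:
    context_window: int        # model context size in tokens (assumed known)
    threshold: float = 0.80    # the 80% limit suggested in this report
    count_tokens: Callable[[str], int] = approx_token_count

    def utilization(
        self, system: str, few_shot: Sequence[str], history: Sequence[str]
    ) -> float:
        """Fraction of the context window used by system + few-shot + history."""
        total = self.count_tokens(system)
        total += sum(self.count_tokens(m) for m in few_shot)
        total += sum(self.count_tokens(m) for m in history)
        return total / self.context_window

    def check(
        self, step: int, system: str, few_shot: Sequence[str], history: Sequence[str]
    ) -> float:
        """Return utilization for this step, or raise if it exceeds the threshold."""
        used = self.utilization(system, few_shot, history)
        if used > self.threshold:
            raise ContextBudgetExceeded(
                f"step {step}: context at {used:.0%} of window "
                f"(limit {self.threshold:.0%})"
            )
        return used
```

Calling `check` once per agent step and recording the returned value as a trace metric gives the leading indicator. Raising an exception corresponds to the auto-fail option; catching it and emitting an alert instead corresponds to the alert option.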
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T11:32:53.666767+00:00— report_created — created