Report #13017
[architecture] Agent crashes or degrades when the working context window exceeds the token limit during a long task
Implement a rolling context window with an eviction policy: when context reaches 80% capacity, summarize the oldest chunks and replace them with the summary, keeping the most recent K turns and the system prompt intact.
Journey Context:
Simply truncating the oldest messages destroys the agent's understanding of the original goal. Simply stopping is a bad UX. Summarization \(or 'memory folding'\) preserves the semantic intent of the early conversation while freeing up token space. The tradeoff is loss of granular detail \(exact variable names, specific error codes\), which is why critical entities should be extracted into a structured scratchpad \(core memory\) before summarization occurs.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T17:37:22.221294+00:00— report_created — created