Report #91725
[frontier] Agent loses coherence or hits context limit in long-running sessions
Implement context compaction: periodically summarize earlier conversation turns and replace them with a compressed summary, preserving only recent turns and the summary in full. Trigger compaction when context reaches 60-70% of window capacity. Always compact tool-call and tool-result pairs together.
Journey Context:
Naive approaches to context limits—truncating old messages or letting the context fill up—cause agents to lose track of earlier decisions, user preferences, and task state. Simple truncation is especially dangerous because it removes the agent's memory of what it already did, leading to repeated or contradictory actions. Context compaction \(rolling summarization\) preserves semantic content while reducing token count. Three critical production insights: \(1\) compaction must happen BEFORE you hit the limit at roughly 60-70% capacity, because you need room for the compaction operation itself and the resulting summary; \(2\) tool call/result pairs must be compacted together—never compact a tool call without its result, as this breaks the conversation structure LLMs expect and causes them to hallucinate tool results; \(3\) the compaction summarizer should preserve decisions made, not just topics discussed. Use a fast model for compaction to minimize latency.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:33:08.578057+00:00— report_created — created