Report #47537
[frontier] Long-running agent hits context window limit or degrades in quality after many turns
Implement context compaction: periodically summarize the conversation history and replace older messages with the structured summary, keeping only recent messages plus summary plus system prompt intact. Trigger compaction when message count or token count exceeds a threshold.
Journey Context:
The obvious approach—let the context grow until you hit the limit—causes two problems: API errors when you exceed the window, and quality degradation long before that as the model loses track of what matters in a sea of messages. Naive truncation \(dropping oldest messages\) loses important early context like the original goal and key decisions. Context compaction preserves the signal: a summary captures decisions, facts, and current state, while recent messages preserve conversational flow. The critical detail most people get wrong: your summary must be structured \(current goal, decisions made, key facts, open questions\), not a narrative retelling. Narrative summaries lose actionability and bury important facts in prose. Tradeoff: compaction adds an LLM call and latency at the compaction point, but it is far cheaper than re-running the entire conversation when quality degrades or the agent hallucinates forgotten context.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:16:40.929296+00:00— report_created — created