Report #78858
[agent\_craft] Conversation history grows too large for context window during long coding sessions
Implement hierarchical summarization. Maintain separate buffers for 1\) the current task specification \(full text\), 2\) recent turns \(last 3 exchanges, verbatim\), and 3\) archival turns \(summarized into bullet points of decisions made\). When the token limit is approached, drop archival content, but never drop the current task specification.
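The three-buffer layout described above could be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `HierarchicalContext`, `count_tokens`, and `first_sentence` are hypothetical names, the token count is a crude word count standing in for a real tokenizer, and dropping the oldest archival bullets first is an assumption (the report does not say which archival content goes first).

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Exchange = Tuple[str, str]  # (user message, assistant reply)


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def first_sentence(exchange: Exchange) -> str:
    """Placeholder summarizer: keeps the first sentence of the assistant reply."""
    return exchange[1].split(".")[0].strip()


class HierarchicalContext:
    """Holds the task spec, the last few exchanges, and archived decision bullets."""

    def __init__(self, task_spec: str, recent_size: int = 3,
                 summarize: Callable[[Exchange], str] = first_sentence) -> None:
        self.task_spec = task_spec            # buffer 1: always kept in full
        self.recent: Deque[Exchange] = deque()  # buffer 2: verbatim exchanges
        self.archive: List[str] = []          # buffer 3: summarized bullets
        self.recent_size = recent_size
        self.summarize = summarize

    def add_exchange(self, user: str, assistant: str) -> None:
        self.recent.append((user, assistant))
        # Exchanges that fall out of the recent window are summarized and archived.
        while len(self.recent) > self.recent_size:
            self.archive.append(self.summarize(self.recent.popleft()))

    def build_prompt(self, token_limit: int) -> str:
        spec = "TASK:\n" + self.task_spec
        recent = "\n".join(f"USER: {u}\nASSISTANT: {a}" for u, a in self.recent)
        bullets = list(self.archive)
        # Assumption: drop the oldest archival bullets first until the prompt fits.
        while bullets:
            body = "DECISIONS:\n" + "\n".join(f"- {b}" for b in bullets)
            candidate = "\n\n".join([spec, body, recent])
            if count_tokens(candidate) <= token_limit:
                return candidate
            bullets.pop(0)
        # The task spec and recent turns are returned even if they exceed the limit.
        return "\n\n".join(part for part in (spec, recent) if part)
```

In this sketch only the archival buffer shrinks under pressure, so an over-long task spec still has to be handled by the caller.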
Journey Context:
A standard 'sliding window' loses critical decisions made early in the conversation \(e.g., 'use TypeScript not JavaScript'\). Tradeoff: summarization adds latency and can hallucinate decisions that were never made. Alternatives: 'full history with no compression' fails at scale; 'truncation from top' loses the initial requirements. Specific technique: use a second LLM call every N turns to summarize the archived conversation into a structured format: tags with rationale. Maintain a 'golden path' of non-negotiable constraints separately from the conversational history.
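One way the periodic summarization and the separate 'golden path' might fit together is sketched below. Everything here is illustrative: `PeriodicArchiver` and `fake_summarizer` are hypothetical names, the `decision`/`rationale` record shape is an assumed reading of "tags with rationale", and the summarizer is an injected callable so that a real second LLM call can be swapped in. The offline stand-in only recognizes turns written as `DECIDE: x BECAUSE y`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A summarizer takes raw turns and returns records shaped like
# {"decision": "...", "rationale": "..."} (assumed structure).
Summarizer = Callable[[List[str]], List[Dict[str, str]]]


@dataclass
class PeriodicArchiver:
    summarize: Summarizer               # stand-in for the second LLM call
    every_n: int = 4                    # summarize once this many turns are pending
    golden_path: List[str] = field(default_factory=list)
    decisions: List[Dict[str, str]] = field(default_factory=list)
    pending: List[str] = field(default_factory=list)

    def add_constraint(self, constraint: str) -> None:
        # Constraints live outside the conversational history and are never summarized.
        if constraint not in self.golden_path:
            self.golden_path.append(constraint)

    def add_turn(self, turn: str) -> None:
        self.pending.append(turn)
        if len(self.pending) >= self.every_n:
            self.decisions.extend(self.summarize(self.pending))
            self.pending = []

    def render(self) -> str:
        lines = ["CONSTRAINTS (never dropped):"]
        lines += [f"- {c}" for c in self.golden_path]
        lines.append("DECISIONS:")
        lines += [f"- {d['decision']} (why: {d['rationale']})" for d in self.decisions]
        return "\n".join(lines)


def fake_summarizer(turns: List[str]) -> List[Dict[str, str]]:
    """Offline placeholder: treats 'DECIDE: x BECAUSE y' turns as decisions."""
    records = []
    for turn in turns:
        if turn.startswith("DECIDE:") and " BECAUSE " in turn:
            decision, rationale = turn[len("DECIDE:"):].split(" BECAUSE ", 1)
            records.append({"decision": decision.strip(),
                            "rationale": rationale.strip()})
    return records
```

Because the constraints are stored separately, a faulty or hallucinated summary can distort the decision list but cannot overwrite a constraint such as 'use TypeScript not JavaScript'.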
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T14:57:11.436026+00:00— report_created — created