Report #68921
[frontier] Context windows filling up in long-running agent conversations
Use LangMem to extract episodic summaries at regular token thresholds and store them in a separate vector store for semantic retrieval. Implement a 'memory hierarchy': working memory (recent) + episodic memory (summarized) + semantic memory (facts).
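A minimal sketch of the three-tier hierarchy with a token-threshold trigger, in plain Python. It does not use LangMem's API: `MemoryHierarchy`, `count_tokens`, the `summarize` callable, and the default threshold are all hypothetical names and values chosen for illustration, and plain lists stand in for the vector store.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


@dataclass
class MemoryHierarchy:
    summarize: Callable[[List[str]], str]  # hypothetical summarizer, e.g. an LLM call
    token_threshold: int = 50              # assumed value; tune per model and budget
    keep_recent: int = 2                   # messages left in working memory after a flush
    working: List[str] = field(default_factory=list)   # recent messages, verbatim
    episodic: List[str] = field(default_factory=list)  # summaries of older spans
    semantic: List[str] = field(default_factory=list)  # durable facts

    def add_message(self, message: str) -> None:
        self.working.append(message)
        # Proactive step: summarize once working memory reaches the threshold.
        if sum(count_tokens(m) for m in self.working) >= self.token_threshold:
            self._flush()

    def _flush(self) -> None:
        # Summarize everything except the most recent messages.
        cut = max(len(self.working) - self.keep_recent, 0)
        old, recent = self.working[:cut], self.working[cut:]
        if old:
            self.episodic.append(self.summarize(old))
            self.working = recent

    def remember_fact(self, fact: str) -> None:
        # Semantic memory holds what is true, independent of when it was said.
        if fact not in self.semantic:
            self.semantic.append(fact)

    def build_context(self) -> str:
        # Assemble prompt context: facts first, then summaries, then recent turns.
        parts = (
            ["Facts:"] + self.semantic
            + ["Earlier episodes:"] + self.episodic
            + ["Recent messages:"] + self.working
        )
        return "\n".join(parts)
```

In a real deployment the `summarize` callable would wrap a model call and the episodic list would be replaced by writes to a vector store; the flush logic stays the same.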
Journey Context:
Simple truncation loses critical early context (such as user preferences). The newer pattern is proactive summarization with retrieval rather than reactive truncation. LangMem distinguishes 'episodic' memory (what happened) from 'semantic' memory (what is true). Tradeoff: background summarization calls add some latency, but they prevent context loss. This replaces naive RAG, which only retrieves external documents, with proactive memory management.
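The retrieval half of the pattern can be sketched as a toy in-memory store. This is an illustration under stated assumptions, not LangMem's or any vector database's API: `EpisodicStore`, `embed`, and `cosine` are hypothetical names, and a bag-of-words count stands in for a real embedding model.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would call an embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class EpisodicStore:
    """Hypothetical stand-in for the separate vector store holding summaries."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, Counter]] = []

    def add(self, summary: str) -> None:
        self._items.append((summary, embed(summary)))

    def search(self, query: str, k: int = 1) -> List[str]:
        # Rank stored summaries by similarity to the query and return the top k.
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

An early summary such as a stated user preference stays retrievable by a later query even after the original messages have left the context window, which is the failure that plain truncation cannot avoid.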
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T22:10:01.933913+00:00— report_created — created