Report #78609
[frontier] Long-running agents lose context or exceed token limits; naive RAG misses temporal sequence and recency
Implement three-tier hierarchical memory: Working (recent raw messages in a sliding window), Episodic (compressed session summaries produced by LLM consolidation), and Semantic (knowledge-graph entities), with explicit retrieval routing based on query type.
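A minimal sketch of the three tiers, assuming an in-process store. The class name `HierarchicalMemory`, its methods, and the `summarize` callable are hypothetical names chosen for illustration; `summarize` stands in for whatever LLM call performs the consolidation.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Set, Tuple


@dataclass
class HierarchicalMemory:
    # Hypothetical stand-in for an LLM summarization call.
    summarize: Callable[[List[str]], str]
    window: int = 4  # assumed working-memory size (the "last N messages")
    working: Deque[str] = field(default_factory=deque)
    session_log: List[str] = field(default_factory=list)
    episodic: List[str] = field(default_factory=list)
    semantic: Dict[str, Set[Tuple[str, str]]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # A bounded deque drops the oldest message automatically (sliding window).
        self.working = deque(self.working, maxlen=self.window)

    def add_message(self, msg: str) -> None:
        self.working.append(msg)      # tier 1: recent raw messages
        self.session_log.append(msg)  # full log, kept until the session closes

    def add_fact(self, subject: str, relation: str, obj: str) -> None:
        # Tier 3: entity -> set of (relation, object) edges.
        self.semantic.setdefault(subject, set()).add((relation, obj))

    def close_session(self) -> None:
        # Tier 2: compress the closed session into one episodic summary.
        if self.session_log:
            self.episodic.append(self.summarize(self.session_log))
        self.session_log = []
```

In a real deployment `close_session` would more likely run as a background job, and the semantic tier would live in a graph store rather than a dict; both are simplified here.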
Journey Context:
Vector-only retrieval loses recency, causality, and working context. Working memory holds the last N messages for immediate in-context learning. Episodic memory stores compressed summaries of closed sessions, produced by background 'memory consolidation' jobs that summarize each session and extract its key events. Semantic memory stores the extracted entities and relations in a graph. Critically, the retrieval router uses heuristics (temporal keywords vs. factual questions) to decide which tier to query. Keeping working memory small limits 'lost in the middle' degradation, while the deeper history stays available in the episodic and semantic tiers.
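The routing heuristic can be sketched as a keyword classifier. The patterns, the `Tier` enum, and the `route` function below are illustrative assumptions, not the rules from the report; a production router would need a broader vocabulary and probably a fallback that queries more than one tier.

```python
import re
from enum import Enum


class Tier(Enum):
    WORKING = "working"
    EPISODIC = "episodic"
    SEMANTIC = "semantic"


# Assumed cue words: temporal phrasing suggests a question about past sessions.
TEMPORAL = re.compile(
    r"\b(yesterday|last (week|time|session|month)|earlier|previously|ago|when did)\b",
    re.IGNORECASE,
)
# Assumed cue words: a leading wh-word suggests a factual lookup.
FACTUAL = re.compile(r"^\s*(who|what|where|which)\b", re.IGNORECASE)


def route(query: str) -> Tier:
    """Pick the memory tier to query, checking temporal cues first."""
    if TEMPORAL.search(query):
        return Tier.EPISODIC
    if FACTUAL.search(query):
        return Tier.SEMANTIC
    # No cue matched: assume the query refers to the current conversation.
    return Tier.WORKING
```

Temporal cues are checked first so that a query such as "What did we decide last week?" goes to episodic memory even though it also opens with a factual wh-word.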
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T14:32:30.822742+00:00— report_created — created