Report #60715
[frontier] Agents lose track of long-term conversation history and user-specific facts across sessions due to flat vector storage
Implement a memory hierarchy based on the Letta (formerly MemGPT) architecture: store conversational episodes in a vector database for semantic search and in a knowledge graph for relational reasoning, and give the agent explicit memory management operators.
Journey Context:
Standard chatbots keep a sliding window of recent messages or a single running summary, so they lose older user preferences and multi-hop facts (e.g., 'remind me of that issue we discussed Tuesday'). The production pattern emerging from Letta (MemGPT paper 2023, productionized in 2025) treats memory as a tiered system: core memory (a fixed token budget, always in context, for critical user facts), recall memory (the full message history, kept outside the context window and searchable), and archival memory (a vector store for long-term facts and documents that do not fit in core memory). Agents call explicit memory tools (e.g., `core_memory_append`, `archival_memory_search`) to manage their own context window. Tradeoff: increased system complexity, plus added latency from the extra tool calls and database writes. Alternative: larger context windows (128k+), but retrieval is still needed for precise recall. This wins because it works like an operating system's memory hierarchy, paging facts in and out of a limited context window, which lets persistent agents remember user preferences across days and reason about temporal sequences.
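The tiering described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Letta's implementation: the `TieredMemory` class, the character-based budget, and the bag-of-words similarity (standing in for a real embedding model and vector database) are all hypothetical. The method names only echo the tool names mentioned in the text.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import List


def _tokens(text: str) -> Counter:
    # Assumption: bag-of-words counts stand in for a real embedding vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


@dataclass
class TieredMemory:
    # Hypothetical budget in characters; a real system would count tokens.
    core_budget_chars: int = 200
    core: List[str] = field(default_factory=list)      # always placed in the prompt
    recall: List[str] = field(default_factory=list)    # full message log, out of context
    archival: List[str] = field(default_factory=list)  # long-term store, searched on demand

    def core_memory_append(self, fact: str) -> bool:
        """Add a fact to core memory; refuse if the fixed budget would be exceeded."""
        used = sum(len(f) for f in self.core)
        if used + len(fact) > self.core_budget_chars:
            return False  # caller should evict or move the fact to archival memory
        self.core.append(fact)
        return True

    def archival_memory_insert(self, text: str) -> None:
        self.archival.append(text)

    def archival_memory_search(self, query: str, k: int = 3) -> List[str]:
        """Return up to k archival entries ranked by similarity to the query."""
        q = _tokens(query)
        scored = [(_cosine(q, _tokens(doc)), doc) for doc in self.archival]
        scored = [(s, d) for s, d in scored if s > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [d for _, d in scored[:k]]

    def log_message(self, role: str, text: str) -> None:
        self.recall.append(f"{role}: {text}")

    def conversation_search(self, keyword: str) -> List[str]:
        """Plain substring search over the full message history."""
        return [m for m in self.recall if keyword.lower() in m.lower()]

    def build_context(self, recent: int = 2) -> str:
        """Assemble the prompt: all core facts plus only the most recent messages."""
        tail = self.recall[-recent:] if recent > 0 else []
        return "\n".join(["[core]"] + self.core + ["[recent]"] + tail)


# Usage: the agent decides what stays in context and what is paged out.
mem = TieredMemory(core_budget_chars=40)
mem.core_memory_append("User prefers dark mode")
mem.archival_memory_insert("Tuesday: discussed login timeout issue on mobile app")
mem.archival_memory_insert("Friday: user asked about invoice export")
mem.log_message("user", "remind me of that issue we discussed Tuesday")
hits = mem.archival_memory_search("that issue we discussed Tuesday", k=1)
```

The design point is that `build_context` never grows with history: only core facts and a short tail of messages enter the prompt, and everything else has to be fetched through an explicit search call. That extra call is where the latency tradeoff noted above comes from.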
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T08:23:48.584967+00:00— report_created — created