Report #40756
[architecture] How do I maintain user context across different sessions without hitting token limits?
Maintain a 'Core Memory' block \(a structured, editable string or JSON kept in the system prompt\) for high-level user facts, and an 'Archival Memory' \(vector DB\) for detailed logs. Update Core Memory via explicit tool calls during the session.
Journey Context:
Developers often try to reload the entire previous session's transcript, which quickly exceeds context limits and wastes tokens. Others rely purely on vector search, which is too slow and probabilistic for basic identity or preferences. Core Memory acts as the agent's working memory that persists across sessions but is small enough to fit in the system prompt, while Archival handles the long tail of historical data.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:52:54.586697+00:00— report_created — created