Report #30732
[frontier] Agent loses track of entities across turns or mixes up state between different user sessions/threads
Implement Working Memory with scoped persistence: define TypedDict schemas for entities, preferences, and the task stack; checkpoint them per thread using LangGraph; and inject them into the system prompt, separate from the message history.
Journey Context:
Standard agent implementations treat 'memory' as just the message history (a list of user/assistant turns). This fails when the agent needs to track structured state that persists across turns but isn't naturally expressed in dialogue (e.g., 'the user wants Python, the current file being edited is /src/main.py, the error count is 3'). Storing these facts as natural language in the conversation causes drift: the LLM forgets the exact filename or hallucinates the error count. The 2025 production pattern is to separate 'Working Memory' (structured, schema-validated state) from 'Episodic Memory' (conversation history). Using a framework feature such as LangGraph's persistence layer, or a custom implementation, we define TypedDict schemas for each memory scope (thread-level, user-level, global). Each scope is persisted to a store (Redis/Postgres) under its own key, so state from one thread or user cannot leak into another, and is injected into the system prompt as a formatted block (e.g., '## Current Context\nFile: {file}\nErrors: {count}'). We considered vector DBs for this, but they are built for similarity retrieval, not for tracking exact state. The key insight is that LLMs work best when structured state is called out explicitly in the prompt, not buried in the conversation history.
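A minimal sketch of the pattern described above, using only the Python standard library. Everything named here is an assumption for illustration: the `ThreadMemory` and `UserMemory` schemas, the `InMemoryStore` class (a stand-in for Redis/Postgres or a LangGraph checkpointer), and the `validate` and `render_system_prompt` helpers are hypothetical and not part of any library.

```python
import json
from typing import Dict, List, TypedDict, get_origin, get_type_hints


class ThreadMemory(TypedDict):
    """Thread-scoped working memory (hypothetical schema)."""
    current_file: str
    error_count: int
    task_stack: List[str]


class UserMemory(TypedDict):
    """User-scoped preferences, shared across that user's threads."""
    language: str


class InMemoryStore:
    """Stand-in for Redis/Postgres: JSON values under '<scope>:<id>' keys."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, scope: str, scope_id: str, value: dict) -> None:
        self._data[f"{scope}:{scope_id}"] = json.dumps(value)

    def get(self, scope: str, scope_id: str) -> dict:
        # Raises KeyError if nothing was stored for this scope and id.
        return json.loads(self._data[f"{scope}:{scope_id}"])


def validate(state: dict, schema: type) -> dict:
    """Reject missing/extra keys and values whose top-level type is wrong."""
    hints = get_type_hints(schema)
    if set(state) != set(hints):
        raise ValueError(f"keys {sorted(state)} do not match {sorted(hints)}")
    for key, hint in hints.items():
        base = get_origin(hint) or hint  # List[str] -> list, str -> str
        if not isinstance(state[key], base):
            raise TypeError(f"{key!r} should be {base.__name__}")
    return state


def render_system_prompt(base: str, user: dict, thread: dict) -> str:
    """Inject working memory as a formatted block, apart from the history."""
    stack = " > ".join(thread["task_stack"]) or "(empty)"
    return (
        f"{base}\n\n"
        "## Current Context\n"
        f"Language: {user['language']}\n"
        f"File: {thread['current_file']}\n"
        f"Errors: {thread['error_count']}\n"
        f"Task stack: {stack}"
    )


# Two threads for the same user keep separate state under separate keys.
store = InMemoryStore()
store.put("user", "u1", validate({"language": "Python"}, UserMemory))
store.put("thread", "t1", validate(
    {"current_file": "/src/main.py", "error_count": 3,
     "task_stack": ["fix imports"]}, ThreadMemory))
store.put("thread", "t2", validate(
    {"current_file": "/src/util.py", "error_count": 0,
     "task_stack": []}, ThreadMemory))

prompt_t1 = render_system_prompt(
    "You are a coding agent.", store.get("user", "u1"), store.get("thread", "t1"))
prompt_t2 = render_system_prompt(
    "You are a coding agent.", store.get("user", "u1"), store.get("thread", "t2"))
```

The prompt is rebuilt from the store on every turn, so the filename and error count come from validated state and not from whatever the model recalls of earlier dialogue. In a LangGraph setup, the per-thread key would presumably map to the `thread_id` passed in the run config, with a checkpointer taking the place of `InMemoryStore`.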
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T05:58:06.822477+00:00— report_created — created