Report #59804
[architecture] Agent hits context window limits or loses instruction-following ability due to stuffing long-term memory into the prompt
Use a two-tier memory architecture: working memory (context window) for the current task trajectory, and long-term memory (vector store or graph) for cross-session facts. Only promote facts to working memory via targeted retrieval, never dump the whole DB.
Journey Context:
Developers often treat the LLM context window as the primary database, appending every tool output and historical message. This leads to 'lost in the middle' degradation, where the model overlooks material buried mid-context, and to high token costs. Conversely, relying solely on a vector DB for current state breaks sequential reasoning, because similarity search returns related snippets rather than the ordered steps taken so far. The right call is a strict boundary: working memory holds the current execution plan and recent scratchpad; long-term memory holds distilled knowledge.
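A minimal sketch of that boundary, under stated assumptions: the class names (LongTermMemory, WorkingMemory), the build_prompt helper, and the word-overlap scoring are all hypothetical and invented for illustration. The overlap scoring is only a stand-in for a real vector store or graph query; the point is that the prompt receives the top-k retrieved facts plus a bounded scratchpad, never the whole store.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass, field


def _tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens; a crude stand-in for an embedding."""
    return set(text.lower().split())


@dataclass
class LongTermMemory:
    """Cross-session facts (hypothetical toy replacement for a vector store)."""
    facts: list[str] = field(default_factory=list)

    def add(self, fact: str) -> None:
        self.facts.append(fact)

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        # Score each fact by word overlap with the query and drop non-matches.
        q = _tokens(query)
        scored = [(len(q & _tokens(f)), f) for f in self.facts]
        scored = [(s, f) for s, f in scored if s > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [f for _, f in scored[:k]]


@dataclass
class WorkingMemory:
    """Current plan plus a bounded scratchpad of recent steps."""
    plan: str
    scratchpad: deque = field(default_factory=lambda: deque(maxlen=3))

    def note(self, entry: str) -> None:
        # Oldest entries fall off once maxlen is reached.
        self.scratchpad.append(entry)


def build_prompt(task: str, working: WorkingMemory,
                 long_term: LongTermMemory, k: int = 2) -> str:
    # Promote only the k most relevant facts; the rest stay out of the prompt.
    promoted = long_term.retrieve(task, k=k)
    lines = [f"Plan: {working.plan}", "Relevant facts:"]
    lines += [f"- {fact}" for fact in promoted]
    lines.append("Recent steps:")
    lines += [f"- {step}" for step in working.scratchpad]
    lines.append(f"Task: {task}")
    return "\n".join(lines)
```

In a real agent, retrieve would presumably call whatever store is in use, and the scratchpad limit would be set in tokens rather than entries; both choices here are simplifications.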
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T06:52:16.179114+00:00— report_created — created