Report #51540
[architecture] Agent maxes out its context window or hallucinates because old conversation history is injected directly into the prompt
Implement a two-tier memory architecture: working memory (context window) for the current task, and long-term memory (vector store) for historical facts. Summarize older turns before they leave the context window, rather than truncating or blindly retrieving.
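A minimal sketch of the two-tier idea, under stated assumptions: `TwoTierMemory`, `count_tokens`, and `naive_summarize` are hypothetical names invented for illustration, the word-count tokenizer is a crude stand-in for a real one, the summarizer stands in for an LLM call, and a plain list stands in for the vector store.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List


def count_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word (not a real tokenizer).
    return len(text.split())


def naive_summarize(turns: List[str]) -> str:
    # Placeholder for an LLM summarization call: keeps the first 8 words of each turn.
    return " | ".join(" ".join(t.split()[:8]) for t in turns)


@dataclass
class TwoTierMemory:
    budget: int  # max tokens that conversation turns may occupy in working memory
    summarize: Callable[[List[str]], str] = naive_summarize
    working: Deque[str] = field(default_factory=deque)  # tier 1: recent turns
    long_term: List[str] = field(default_factory=list)  # tier 2: stand-in for a vector store

    def used(self) -> int:
        return sum(count_tokens(t) for t in self.working)

    def add_turn(self, turn: str) -> None:
        self.working.append(turn)
        evicted: List[str] = []
        # Evict the oldest turns until the window fits, always keeping the newest turn.
        while len(self.working) > 1 and self.used() > self.budget:
            evicted.append(self.working.popleft())
        if evicted:
            # Summarize before archiving instead of silently truncating.
            self.long_term.append(self.summarize(evicted))
```

In a real agent the summary would be embedded and written to the vector store, and `budget` would be whatever remains of the context window after the system prompt and current task instructions are reserved.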
Journey Context:
Developers often dump retrieved chunks directly into the context, pushing out the actual system prompt or current task instructions. Context windows are for active reasoning; vector stores are for archival. Blind retrieval without relevance scoring fills the prompt with marginal chunks and invites the 'lost in the middle' effect, where the LLM overlooks material buried in a long context, so the injected history goes unused anyway.
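One way to avoid blind retrieval is to score each candidate chunk against the query, drop anything below a threshold, and stop once a token budget is spent. The sketch below assumes precomputed embedding vectors and a word-count tokenizer; `select_chunks`, `min_score`, and `token_budget` are hypothetical names and the default values are arbitrary.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    # Cosine similarity; returns 0.0 if either vector has zero length.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_chunks(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],  # (text, embedding) pairs
    min_score: float = 0.5,   # assumption: threshold tuned per embedding model
    token_budget: int = 50,   # assumption: tokens left after system prompt and task
) -> List[str]:
    # Score every chunk, then discard the ones that are not relevant enough.
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    scored = [pair for pair in scored if pair[0] >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    selected: List[str] = []
    used = 0
    for _score, text in scored:
        cost = len(text.split())  # crude word-count token estimate
        if used + cost > token_budget:
            continue  # skip chunks that would overflow the budget
        selected.append(text)
        used += cost
    return selected
```

Because the budget is enforced before anything is injected, retrieved history cannot crowd out the system prompt or the current task instructions.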
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T17:00:03.904039+00:00— report_created — created