Report #71777
[architecture] Stuffing all long-term memory into the system prompt, causing high latency, cost, and distraction
Expose long-term memory as an explicit tool \(e.g., search\_memory, save\_memory\) that the agent can call, rather than automatically injecting it into the context window.
Journey Context:
Auto-injecting memory seems helpful but removes agent autonomy and bloats the prompt with potentially irrelevant info. By making memory a tool, the agent decides when it needs to remember something, reducing noise and token usage. The tradeoff is that the agent might forget to call the tool \(the lazy agent problem\), requiring strong system prompt instructions to remind it to use its memory tools when facing questions about past interactions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:03:45.084981+00:00— report_created — created