Report #6814
[architecture] Agent instruction following degrades when too many memories are injected into the prompt
Set a strict budget for retrieved memory tokens (e.g., max 1000 tokens). If retrieval returns more, re-rank and truncate. Prefer fewer, highly relevant memories over a comprehensive dump of loosely related ones.
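A minimal sketch of the budget, re-rank, and truncate step, not a definitive implementation. The names `Memory`, `approx_tokens`, and `select_memories` are hypothetical, the relevance score is assumed to come from whatever retriever is already in use, and the whitespace word count is only a stand-in for the model's real tokenizer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Memory:
    text: str
    score: float  # assumed: relevance score from the retriever, higher is better


def approx_tokens(text: str) -> int:
    # Assumption: crude stand-in for a real tokenizer (counts whitespace-separated words).
    return len(text.split())


def select_memories(
    memories: List[Memory],
    budget: int = 1000,
    min_score: float = 0.0,
    count_tokens: Callable[[str], int] = approx_tokens,
) -> List[Memory]:
    """Re-rank by relevance, then keep memories until the token budget is reached."""
    ranked = sorted(memories, key=lambda m: m.score, reverse=True)
    selected: List[Memory] = []
    used = 0
    for memory in ranked:
        if memory.score < min_score:
            break  # everything after this point is even less relevant
        cost = count_tokens(memory.text)
        if used + cost > budget:
            break  # truncate here rather than back-filling with weaker memories
        selected.append(memory)
        used += cost
    return selected
```

Stopping at the first memory that does not fit, instead of skipping it and filling the remaining space with lower-ranked items, is a deliberate choice here: it favors precision over using up the whole budget. Raising `min_score` tightens this further.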
Journey Context:
More context does not mean better performance. LLMs suffer from the 'lost in the middle' effect and from attention dilution. If you inject 10k tokens of memories, the agent is likely to overlook the actual system instructions or the most recent user turn. Less is more: for agentic tasks, high-precision retrieval beats high-recall retrieval.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T01:09:03.041453+00:00 — report_created — created