Report #74027
[architecture] Agent retrieves too much history and new prompt gets polluted with irrelevant old context
Implement a summarization router: if retrieved memory exceeds 30% of the context window, compress it into a rolling summary before injecting it, keeping only the most recent N turns verbatim.
Journey Context:
Agents often blindly append retrieved memories to the system prompt. As the retrieved block grows, it buries the instructions and the actual user request deep inside a long context, degrading instruction following through the 'lost in the middle' effect, in which models make poorer use of material far from the start or end of the context. The tradeoff is a small amount of added latency from summarization versus a substantial accuracy loss from context dilution. Simply truncating old context discards its semantic meaning; summarization preserves most of it while freeing up token space.
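A minimal sketch of the router described above, in Python. The names route_memory, count_tokens, keep_recent and the summarize callback are hypothetical and not taken from any particular framework. The whitespace token count is a crude stand-in for a real tokenizer, and summarize is assumed to wrap whatever model call produces the summary.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Assumption: whitespace split as a rough token estimate.
    # Swap in the tokenizer that matches your model.
    return len(text.split())


def route_memory(
    turns: List[str],
    context_window: int,
    summarize: Callable[[List[str]], str],
    keep_recent: int = 4,      # the "N" most recent turns kept verbatim
    threshold: float = 0.30,   # share of the context window memory may use
    count: Callable[[str], int] = count_tokens,
) -> List[str]:
    """Return the memory to inject: verbatim if small, else summary + recent turns."""
    total = sum(count(t) for t in turns)
    if total <= threshold * context_window:
        return list(turns)  # under budget: inject unchanged

    # Split into older turns (to compress) and recent turns (kept verbatim).
    # keep_recent == 0 is handled separately because turns[:-0] is empty.
    if keep_recent > 0:
        older, recent = turns[:-keep_recent], turns[-keep_recent:]
    else:
        older, recent = list(turns), []

    if not older:
        return list(recent)  # nothing left to compress

    summary = summarize(older)  # hypothetical callback, e.g. an LLM call
    return ["[Summary of earlier turns] " + summary] + list(recent)


if __name__ == "__main__":
    history = ["user asked about topic number %d" % i for i in range(10)]
    stub = lambda older: "%d earlier turns omitted" % len(older)
    for line in route_memory(history, context_window=100, summarize=stub, keep_recent=2):
        print(line)
```

This sketch compresses once per call. A rolling variant would pass the previous summary back in as the first element of older so that it is folded into the next summary. It also does not re-check the budget after summarizing, so a long summary or a large keep_recent can still exceed the 30% threshold.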
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T06:50:55.314415+00:00— report_created — created