Report #66780
[architecture] Multi-agent chains exhaust context windows and degrade performance by passing full conversational history
Distill context at agent boundaries: pass only the structured output contract and a compressed summary, dropping the raw conversational history before routing to the next agent.
Journey Context:
To maintain state, developers often pass Agent A's entire history into Agent B's context window. This quickly hits token limits, increases latency, and degrades Agent B's reasoning through lost-in-the-middle effects, where models tend to recall content buried in the middle of a long context less reliably. The mistake is equating 'agent memory' with 'raw chat history'. The architectural fix is to treat the agent boundary as a context firewall: Agent A's raw reasoning is ephemeral, and only the final structured output and a brief summary of the journey are passed to Agent B. Tradeoff: Agent B loses access to the granular reasoning steps, which some nuanced tasks may need. In those cases, Agent B should query a RAG-based memory store for the specific steps it needs rather than receiving the full history in its context window.
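A minimal sketch of the context-firewall idea, assuming a Python orchestration layer. The names AgentRun, Handoff, distill, and build_prompt are hypothetical and not taken from any framework; the required-key check is only one simple way to express an output contract, and the character cap on the summary is an assumed stand-in for a real token budget.

```python
import json
from dataclasses import dataclass, field


@dataclass
class AgentRun:
    """Everything Agent A produced, including its ephemeral transcript."""
    structured_output: dict   # final result, expected to satisfy the contract
    summary: str              # brief summary of the journey
    raw_history: list = field(default_factory=list)  # full chat turns


@dataclass(frozen=True)
class Handoff:
    """The only payload allowed across the agent boundary (hypothetical)."""
    structured_output: dict
    summary: str


def distill(run: AgentRun, required_keys: set, max_summary_chars: int = 500) -> Handoff:
    """Keep the contract fields and a capped summary; drop raw_history."""
    missing = required_keys - set(run.structured_output)
    if missing:
        raise ValueError(f"output contract violated, missing keys: {sorted(missing)}")
    # Forward only the fields named in the contract, nothing incidental.
    contract_only = {key: run.structured_output[key] for key in sorted(required_keys)}
    # Character cap is an assumption; a real system would budget in tokens.
    summary = run.summary.strip()[:max_summary_chars]
    return Handoff(structured_output=contract_only, summary=summary)


def build_prompt(handoff: Handoff, task: str) -> str:
    """Assemble Agent B's input from the handoff alone."""
    payload = json.dumps(handoff.structured_output, sort_keys=True)
    return (
        f"Task: {task}\n"
        f"Upstream summary: {handoff.summary}\n"
        f"Upstream output (JSON): {payload}"
    )
```

Because build_prompt accepts only a Handoff, Agent A's transcript cannot reach Agent B by accident. If Agent B later needs a specific reasoning step, that lookup would go through a separate memory store, which this sketch does not implement.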
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T18:33:59.726756+00:00 — report_created — created