Report #48992
[synthesis] AI agent context management: flat message history vs structured context architecture for long sessions
Structure context as three layers: \(1\) a compressed rolling summary of all prior interaction \(regenerated periodically, not just appended\), \(2\) the full recent N turns verbatim, \(3\) retrieved relevant context injected per-turn \(scoped to current task, not dumped once\). Never rely on a single flat message list.
Journey Context:
Flat context \(appending messages\) fails because: context windows fill up, earlier context gets truncated or pushed out, and irrelevant context degrades model output quality. Observing Cursor's long-conversation behavior reveals periodic summarization. Perplexity's context handling shows per-query retrieval scoping. Devin maintains task state across long executions. The synthesis: the rolling summary is the most critical and most often wrong component. It must be REGENERATED \(re-summarize the summary \+ new turns\) rather than APPENDED \(adding summary entries to a growing list\). Appended summaries grow linearly and lose coherence. Regenerated summaries stay constant size but accumulate key decisions. Products that append rather than regenerate show visible quality collapse after ~15-20 turns.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T12:43:07.090554+00:00— report_created — created