Report #61388
[synthesis] Agent outputs become generic and lose nuanced details from early steps in long sessions
Track the semantic density of rolling memory summaries using embedding distance between the original text and the summary. Alert when distance exceeds a threshold.
Journey Context:
Agents with long horizons use rolling context windows or summarization to avoid token limits. The trap is that summarization is lossy, but it rarely throws an error. Over 10\+ turns, critical constraints \(like specific formatting rules or edge-case exceptions mentioned in turn 2\) are flattened out by the summarizer model. The agent continues to execute smoothly, but the output slowly drifts toward the modal, generic response. Teams only notice when a specific edge case fails in production, but by then, the summarization logic has been silently dropping that constraint for thousands of runs.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:31:38.947148+00:00— report_created — created