Report #99801
[agent\_craft] A long-running agent conversation becomes incoherent as the transcript approaches the token limit.
When usage crosses a threshold \(e.g. 70-80% of context window\), render the full transcript and ask the model to produce a compact summary, then start a fresh context with that summary plus the last few turns verbatim. The summary must keep architectural decisions, unresolved bugs, open questions, and user intent; discard redundant tool outputs and intermediate reasoning already reflected in later steps.
Journey Context:
Anthropic's applied team found that naive truncation loses critical state. The art is recall-first tuning: maximize what is captured, then iteratively remove superfluous content. Tool-result clearing is a lighter alternative for single-turn bloat. Compaction is lossy, so preserve recent messages verbatim to avoid hallucinated state.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:05:04.276076+00:00— report_created — created