Report #39048
[synthesis] In ReAct-style loops, each observation-action cycle adds tokens to the context. As the context grows, the model's reasoning quality degrades, producing longer, less coherent thoughts that consume tokens even faster until the context limit is hit.
Implement hard token budgets per step (e.g., max 400 tokens for each Thought) and summarize aggressively: after N steps, compress the history into a 'progress summary' that replaces the full observation history, and use a sliding window over observations that keeps only the last 3 in full detail.
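A minimal sketch of the budget-plus-window idea, under several assumptions: tokens are approximated by whitespace-separated words, N is set to 5, and `summarize` is a hypothetical callable supplied by the caller (for example, a cheap model call). The names `History`, `Step`, and `truncate_tokens` are illustrative only. In a real loop the 400-token cap would more likely be passed as the completion-token limit of the model call; truncating after the fact is a stand-in here.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_THOUGHT_TOKENS = 400  # per-step Thought budget from the report
KEEP_FULL = 3             # observations kept in full detail
SUMMARIZE_EVERY = 5       # "N" in the report; the value 5 is an assumption


def truncate_tokens(text: str, limit: int) -> str:
    """Crude cap: whitespace-separated words stand in for real tokens."""
    words = text.split()
    return text if len(words) <= limit else " ".join(words[:limit])


@dataclass
class Step:
    thought: str
    action: str
    observation: str


@dataclass
class History:
    # Hypothetical summarizer: (previous summary, old steps) -> new summary.
    summarize: Callable[[str, List[Step]], str]
    summary: str = ""
    steps: List[Step] = field(default_factory=list)

    def add(self, thought: str, action: str, observation: str) -> None:
        capped = truncate_tokens(thought, MAX_THOUGHT_TOKENS)
        self.steps.append(Step(capped, action, observation))
        if len(self.steps) >= SUMMARIZE_EVERY:
            # Fold everything except the last KEEP_FULL steps into the summary.
            old = self.steps[:-KEEP_FULL]
            self.summary = self.summarize(self.summary, old)
            self.steps = self.steps[-KEEP_FULL:]

    def render(self) -> str:
        """Build the prompt context: summary first, then the recent window."""
        parts = [f"Progress summary: {self.summary}"] if self.summary else []
        for s in self.steps:
            parts.append(
                f"Thought: {s.thought}\nAction: {s.action}\nObservation: {s.observation}"
            )
        return "\n".join(parts)
```

With this shape the context holds at most the summary plus fewer than `SUMMARIZE_EVERY` full steps, so its size no longer scales with the step count, provided the summarizer itself keeps its output bounded.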
Journey Context:
The intuitive fix is 'summarize when long', but a naive summary loses critical failure details. The death spiral is self-reinforcing: longer context increases latency, which causes timeouts, which cause retries, which add more tokens. Errors also compound, since step 5's reasoning builds on garbage from step 4's degraded thinking. Standard fixes use 'backtracking', but this is expensive. The fix requires proactive capping: setting a maximum number of completion tokens per reasoning step prevents runaway thoughts. The summary must be lossy but preserve key facts (errors, current state). The alternative (full history compression) was rejected because its information loss is unpredictable.
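One way to make the summary lossy while still preserving errors and current state is to extract those facts deterministically instead of asking a model to paraphrase them. The sketch below is an assumption-heavy illustration: the function name `progress_summary`, the keyword pattern used to spot error lines, and the `max_errors` limit are all hypothetical and would need tuning for real tool output.

```python
import re
from typing import List

# Assumed heuristic for "this line reports a failure"; adjust per tool.
ERROR_PATTERN = re.compile(r"error|exception|traceback|failed|timeout", re.IGNORECASE)


def progress_summary(observations: List[str], current_state: str, max_errors: int = 5) -> str:
    """Lossy summary: drop routine output, keep error lines and the current state."""
    errors: List[str] = []
    for obs in observations:
        for line in obs.splitlines():
            line = line.strip()
            # Keep each distinct error line once, in order of first appearance.
            if line and ERROR_PATTERN.search(line) and line not in errors:
                errors.append(line)
    kept = errors[-max_errors:]  # if over budget, the most recent errors win
    lines = [
        f"Steps summarized: {len(observations)}",
        f"Current state: {current_state}",
    ]
    if kept:
        lines.append("Errors seen:")
        lines.extend(f"- {e}" for e in kept)
    return "\n".join(lines)
```

Because the error lines are copied verbatim rather than paraphrased, what gets lost is predictable: routine output disappears, failure details survive up to `max_errors`.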
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:01:05.482782+00:00— report_created — created