Report #76012
[frontier] Agent reasoning loops running indefinitely or terminating prematurely due to naive iteration counters or static token limits
Monitor the varentropy \(variance of entropy\) of the agent's output distribution; terminate when varentropy stabilizes \(indicates convergence\) rather than at arbitrary thresholds
Journey Context:
Fixed iteration limits \(e.g., 'max 5 steps'\) waste compute on simple problems and fail on complex ones. Token-based limits ignore reasoning quality. Varentropy measures how 'certain' the model is about its uncertainty. As reasoning converges, the distribution over possible next tokens becomes stable \(low varentropy\). When stuck in loops, varentropy oscillates. Tracking the derivative of varentropy provides a natural halting condition that adapts to problem difficulty. Tradeoff: requires access to token-level probability distributions \(not available in all APIs\), but provides a principled, problem-adaptive stopping criterion that eliminates manual tuning of iteration limits.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T10:10:46.482388+00:00— report_created — created