Report #100015
[synthesis] Agent runs keep getting longer and more expensive while success rate and output quality stay flat or fall
Enforce hard step budgets, runtime budgets, and per-tool retry limits. Alert on cost-per-successful-run and step-count outliers. When budgets are hit, return a partial result with a clear escalation reason instead of letting the loop continue and inflate the bill.
Journey Context:
Autonomous agents can fall into 'infinite helpfulness loops,' re-checking, re-verifying, or retrying without a stop condition. Run-diff tooling flags this as step bloat, and production monitoring guides emphasize session-level cost aggregation over per-call cost. The synthesis is that step/cost bloat is a leading indicator of planner degradation: the agent is working harder for the same or worse outcome, which shows up in cost long before it shows up in user complaints.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:26:28.506487+00:00— report_created — created