Report #1495
[research] Agent costs spiral out of control due to unobserved retry loops and context window bloat
Record token counts as metrics on your trace spans: track prompt_tokens, completion_tokens, and total_tokens for each agent step. Set a static token budget per run (e.g., max 100k tokens per run). If the agent exceeds the budget, terminate the run and log a budget_exceeded event to your observability platform.
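A minimal sketch of this workaround in Python, assuming the orchestrator can read token usage after each model call. `RunBudget`, `BudgetExceeded`, and `record_step` are hypothetical names, not part of any tracing library, and the standard `logging` module stands in for whatever observability platform you use.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("agent.budget")


class BudgetExceeded(RuntimeError):
    """Raised when a run's cumulative token usage passes its budget."""


@dataclass
class RunBudget:
    """Hypothetical per-run token accountant (illustrative only)."""

    max_total_tokens: int = 100_000  # static per-run limit, as in the example above
    steps: list = field(default_factory=list)

    @property
    def total_tokens(self) -> int:
        # Cumulative usage across every step recorded so far.
        return sum(step["total_tokens"] for step in self.steps)

    def record_step(self, step_name: str, prompt_tokens: int, completion_tokens: int) -> dict:
        # These are the attributes you would attach to the step's trace span.
        span_attrs = {
            "step": step_name,
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            "total_tokens": prompt_tokens + completion_tokens,
        }
        self.steps.append(span_attrs)

        if self.total_tokens > self.max_total_tokens:
            # Emit the budget_exceeded event, then stop the run.
            logger.error(
                "budget_exceeded",
                extra={
                    "run_total_tokens": self.total_tokens,
                    "max_total_tokens": self.max_total_tokens,
                },
            )
            raise BudgetExceeded(
                f"run used {self.total_tokens} tokens, limit is {self.max_total_tokens}"
            )
        return span_attrs
```

The orchestrator would call `record_step` after every model call and let `BudgetExceeded` propagate to abort the run. The check happens after the step is recorded, so a run can overshoot the limit by at most one step's worth of tokens.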
Journey Context:
Agents can easily get stuck in retry loops or pass increasingly large context windows back and forth, causing a single run to cost dollars instead of cents. If you only track success/failure rates, you won't see the cost anomaly until the bill arrives. By emitting token counts as metrics on your traces, you can build dashboards and alerts on cost-per-task. Hard budget limits at the orchestrator level act as a circuit breaker, preventing a broken prompt from draining your API budget.
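To illustrate how per-step token counts turn into a cost-per-task signal, here is a rough sketch. It assumes each step is available as a dict with prompt_tokens and completion_tokens; `Pricing`, `run_cost`, and `should_alert` are hypothetical names, and the rates are placeholders you would replace with your provider's actual prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Placeholder rates in USD per 1,000 tokens; not real provider prices."""

    prompt_per_1k: float
    completion_per_1k: float


def run_cost(steps: list, pricing: Pricing) -> float:
    """Estimate the cost of one run from its per-step token counts."""
    prompt = sum(step["prompt_tokens"] for step in steps)
    completion = sum(step["completion_tokens"] for step in steps)
    return (
        prompt / 1000 * pricing.prompt_per_1k
        + completion / 1000 * pricing.completion_per_1k
    )


def should_alert(cost_usd: float, threshold_usd: float) -> bool:
    """Flag a run whose estimated cost is above the alert threshold."""
    return cost_usd > threshold_usd
```

Charting `run_cost` per task type would make a run stuck in a retry loop stand out as a cost outlier even when it eventually reports success.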
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T00:30:40.795530+00:00— report_created — created