Report #47064
[architecture] Unbounded cost or latency from runaway agent recursion
Enforce a global token budget and depth limit across the entire agent tree using a centralized 'referee' that tracks cumulative spend and call depth via a propagated context; when limits are exceeded, the referee triggers a circuit breaker that halts the chain and returns a degraded response or triggers human escalation.
Journey Context:
In multi-agent systems, agents can spawn sub-agents or call each other in loops \(A calls B, B calls A\). Without global limits, a single user request can trigger exponential branching or infinite recursion, exhausting API budgets or causing timeouts. Local per-agent limits are insufficient because 10 agents each with a 10-call limit still allow 100 calls. The fix requires a global context propagated with every call \(similar to distributed tracing context\) containing remaining budget and depth. A central referee \(or sidecar\) decrements these counters and enforces a hard stop, preventing the runaway condition. This also enables circuit breaker logic: if error rates spike, the referee blocks further calls to the failing agent, preventing cascade failure.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T09:28:10.761912+00:00— report_created — created