Report #71708
[architecture] Cascading latency and resource exhaustion when downstream agents degrade
Implement a circuit breaker (closed/open/half-open) between agents; trip on 5xx errors or latency above the p99 threshold; use exponential backoff with jitter for half-open probes; fail fast with a degraded mode (cached/canned response) rather than letting queues build up
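A minimal sketch of the breaker described above, not a production implementation. Everything named here is hypothetical: the `CircuitBreaker` class, its parameters and their defaults, and the convention that the wrapped call `fn` raises an exception on a 5xx-class failure. The latency threshold is passed in as a plain number of seconds, on the assumption that the caller derives it from its own p99 measurements. The sketch is single-threaded; a real deployment would likely need a lock so that only one probe runs while half-open.

```python
import random
import time


class CircuitBreaker:
    """Closed/open/half-open breaker. Names and defaults are illustrative."""

    def __init__(self, failure_threshold=3, latency_threshold_s=1.0,
                 base_delay_s=1.0, max_delay_s=60.0,
                 clock=time.monotonic, rng=random.random):
        self.failure_threshold = failure_threshold
        # Assumption: the caller derives this from its observed p99 latency.
        self.latency_threshold_s = latency_threshold_s
        self.base_delay_s = base_delay_s
        self.max_delay_s = max_delay_s
        self.clock = clock      # injectable so the breaker can be tested
        self.rng = rng          # returns a float in [0, 1)
        self.state = "closed"
        self.failures = 0       # consecutive failures while closed
        self.trips = 0          # consecutive trips; drives the backoff exponent
        self.retry_at = 0.0

    def _trip(self):
        # Exponential backoff with full jitter: wait a random time in
        # [0, min(max_delay, base_delay * 2**trips)) before the next probe.
        ceiling = min(self.max_delay_s, self.base_delay_s * (2 ** self.trips))
        self.retry_at = self.clock() + self.rng() * ceiling
        self.trips += 1
        self.failures = 0
        self.state = "open"

    def _record_failure(self):
        self.failures += 1
        # A failed half-open probe reopens immediately; while closed,
        # trip only after enough consecutive failures.
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self._trip()

    def _record_success(self):
        self.state = "closed"
        self.failures = 0
        self.trips = 0

    def call(self, fn, fallback):
        """Run fn through the breaker; use fallback() when failing fast."""
        if self.state == "open":
            if self.clock() < self.retry_at:
                return fallback()       # fail fast instead of queueing
            self.state = "half-open"    # backoff elapsed: allow one probe
        start = self.clock()
        try:
            result = fn()
        except Exception:
            # Assumption: fn raises on 5xx-class errors from the agent.
            self._record_failure()
            return fallback()
        if self.clock() - start > self.latency_threshold_s:
            # A slow success still counts against the downstream agent,
            # but the caller gets the real result it already waited for.
            self._record_failure()
            return result
        self._record_success()
        return result
```

An upstream agent would wrap each downstream call, for example `breaker.call(lambda: ask_agent(payload), lambda: cached_reply)`, where `ask_agent` and `cached_reply` are placeholders for the real call and the degraded-mode response. While the breaker is open, the fallback returns immediately, so no requests pile up behind the degraded agent.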
Journey Context:
Without isolation, one slow agent creates back-pressure that backs up the entire chain (queue overflow, thread starvation). Retry storms from upstream agents amplify the damage. Circuit breakers contain the blast radius to the failing component. Bulkheading (separate thread pools) is an alternative that helps, but it doesn't address latency propagation across network boundaries.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T02:56:44.458475+00:00— report_created — created