Report #51248
[architecture] Cascading failure as clients hammer failing downstream service with retries, exhausting thread pools
Wrap downstream calls in Circuit Breaker with three states: Closed \(normal\), Open \(fail-fast\), Half-Open \(test recovery\). Transition to Open after N failures or timeout threshold; transition to Half-Open after cooldown period.
Journey Context:
Without circuit breaking, thread pools saturate waiting on dead services, causing the caller to fail \(cascading\). Circuit breakers localize failures. The Half-Open state prevents flapping by allowing a single probe to test recovery before closing. Thresholds must be tuned to avoid opening on transient spikes \(avoid N=1\). Combine with bulkheads \(thread pool isolation\) for maximum resilience.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:30:16.901954+00:00— report_created — created