Report #65266
[architecture] How to prevent a slow downstream service from overwhelming your thread pools?
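Wrap external calls in a Circuit Breaker with three states: Closed (normal), Open (fail fast for a timeout period), Half-Open (allow probe requests). Give every external call a timeout so that slow calls count as failures. Trip to Open on N consecutive failures; return a fallback or error immediately while Open. After the timeout period, move to Half-Open: a successful probe closes the breaker, a failed probe reopens it.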
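The three states above can be sketched as a small state machine. This is a minimal single-threaded illustration, not a production implementation: the names `CircuitBreaker`, `CircuitOpenError`, `failure_threshold`, and `reset_timeout` are hypothetical, and a real breaker would also need locking and a cap on concurrent Half-Open probes. It assumes the wrapped function raises on failure, including on timeout.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without touching the downstream."""


class CircuitBreaker:
    CLOSED, OPEN, HALF_OPEN = "closed", "open", "half_open"

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # N consecutive failures (assumed default)
        self.reset_timeout = reset_timeout          # seconds to stay Open (assumed default)
        self.clock = clock                          # injectable so tests can control time
        self.state = self.CLOSED
        self.failures = 0
        self.opened_at = 0.0

    def call(self, func, *args, **kwargs):
        if self.state == self.OPEN:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast: no thread is parked waiting on the slow service.
                raise CircuitOpenError("circuit open, failing fast")
            # Timeout period elapsed: let this call through as a probe.
            self.state = self.HALF_OPEN
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        self.failures += 1
        # A failed probe reopens immediately; otherwise trip on N in a row.
        if self.state == self.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = self.OPEN
            self.opened_at = self.clock()

    def _on_success(self):
        # Any success resets the consecutive-failure count and closes the breaker.
        self.failures = 0
        self.state = self.CLOSED
```

A caller would wrap each downstream request, for example `breaker.call(fetch_profile, user_id)` with a hypothetical `fetch_profile`, and catch `CircuitOpenError` to return a fallback.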
Journey Context:
Without this, one slow dependency (for example, a slow database query) can tie up every Tomcat/Netty thread, causing the entire app to return 503s (cascading failure). Retries amplify the problem (retry storm). The Circuit Breaker contains the blast radius by converting slow failures into fast failures, preserving threads for healthy operations. Key insight: you need a Half-Open state to auto-heal; manual intervention is too slow for transient network blips. This is the Circuit Breaker pattern from Release It! by Michael Nygard; the Bulkhead pattern from the same book is a separate, complementary technique that isolates resource pools per dependency.
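A rough back-of-the-envelope calculation shows why slow failures are worse than fast ones. The sketch below applies Little's law (average busy threads = arrival rate × time per call). The pool size, request rate, and latencies are made-up illustrative numbers, not measurements.

```python
def busy_threads(requests_per_second, seconds_per_call):
    """Average number of threads tied up in downstream calls (Little's law)."""
    return requests_per_second * seconds_per_call


POOL_SIZE = 200          # assumed worker pool size
REQUESTS_PER_SECOND = 100  # assumed steady load

healthy = busy_threads(REQUESTS_PER_SECOND, 0.05)        # 50 ms per call
degraded = busy_threads(REQUESTS_PER_SECOND, 30.0)       # calls hang until a 30 s timeout
failing_fast = busy_threads(REQUESTS_PER_SECOND, 0.001)  # breaker Open, rejects in ~1 ms

for label, demand in [("healthy", healthy), ("degraded", degraded), ("failing fast", failing_fast)]:
    status = "pool exhausted" if demand > POOL_SIZE else "ok"
    print(f"{label}: ~{demand:g} threads needed, {status}")
```

Under these assumptions the degraded case demands far more threads than the pool holds, while failing fast leaves nearly the whole pool free for requests that do not touch the broken dependency.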
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:02:04.863178+00:00— report_created — created