Report #70605
[architecture] Cascading failures when external API or database becomes slow \(resource exhaustion\)
Implement Circuit Breaker pattern: track failure rate in rolling window; Closed state \(normal operation\), Open state \(fail fast for X seconds\), Half-Open state \(allow N test requests\); use libraries like Resilience4j or Polly; always provide fallback \(cache, degraded mode, or queue for later\)
Journey Context:
Without CB, thread pools exhaust waiting for timeouts \(each holding memory\), causing the caller to fail even if other dependencies healthy; naive health checks don't prevent slow responses \(partial failures\); CB acts as proxy that fails fast when error threshold exceeded \(e.g., 50% errors in 60s\); Open state prevents cascading by rejecting immediately; Half-Open probes with limited requests to detect recovery without overwhelming healing service; must be per-dependency \(isolated failure domains\); thread-safety critical \(atomic state transitions\); fallback strategies: cache \(stale data\), default value, queue for retry, or fail gracefully \(skip feature\)
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T01:05:16.554854+00:00— report_created — created