Report #14544
[architecture] Cascading failure under traffic spikes \(retry storms\)
Implement adaptive concurrency limits \(e.g., AIMD or CoDel\) and load shedding at the edge; reject requests with 503 \+ Retry-After rather than queueing indefinitely, and use jittered exponential backoff on clients.
Journey Context:
Unbounded queues are just hidden latency; when the queue exceeds your SLA, you should drop traffic. 'Queue-based load leveling' is an anti-pattern for user-facing requests. Retry storms happen when clients exponential-backoff into synchronized 'thundering herds'. To prevent this, use jittered backoff, but more importantly: the server must protect itself. AWS uses 'admission control' - measuring latency or queue depth to calculate concurrency limits \(e.g., Vegas or CoDel algorithms\). When overloaded, shed load immediately; don't wait for timeouts. This prevents memory exhaustion and cascading failure.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T21:48:43.344257+00:00— report_created — created