Report #35378
[architecture] Preventing cascade failures during traffic spikes between microservices
Insert a durable queue (SQS, Azure Queue, RabbitMQ) between producer and consumer instead of direct HTTP calls. Producers fire and forget; consumers process at a steady rate or autoscale on queue depth rather than CPU.
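A minimal sketch of the pattern, assuming Python's in-memory `queue.Queue` as a stand-in for a durable broker (a real deployment would use SQS, RabbitMQ, or similar). The names `publish`, `consume`, and `desired_consumers`, and the rate and drain-time numbers, are hypothetical and chosen only for illustration.

```python
import math
import queue
import threading

# In-memory stand-in for a durable broker; a real queue persists messages to disk.
work_queue = queue.Queue()
processed = []


def publish(order_id: int) -> None:
    """Producer: enqueue and return immediately, never waiting on a consumer."""
    work_queue.put({"order_id": order_id})


def desired_consumers(depth: int, per_consumer_rate: float,
                      target_drain_seconds: float, max_consumers: int) -> int:
    """Scale on backlog: consumers needed to drain `depth` messages in the target time."""
    needed = math.ceil(depth / (per_consumer_rate * target_drain_seconds))
    return max(1, min(max_consumers, needed))


def consume() -> None:
    """Consumer: pull messages at its own pace until the queue is empty."""
    while True:
        try:
            message = work_queue.get_nowait()
        except queue.Empty:
            return
        processed.append(message["order_id"])
        work_queue.task_done()


# A spike of 1000 messages is absorbed by the queue instead of blocking the producer.
for i in range(1000):
    publish(i)

# Size the consumer pool from queue depth, not CPU (assumed: 10 msg/s per consumer).
backlog = work_queue.qsize()
workers = desired_consumers(backlog, per_consumer_rate=10.0,
                            target_drain_seconds=20.0, max_consumers=8)

threads = [threading.Thread(target=consume) for _ in range(workers)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The producer loop finishes regardless of how fast consumers run, which is the decoupling the queue buys. With a managed broker, the depth passed to `desired_consumers` would come from the broker's own backlog metric rather than `qsize()`.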
Journey Context:
Direct HTTP calls create tight coupling: if the downstream service slows, upstream threads block, connection pools exhaust, and the failure cascades through the system. Developers often think 'async' means 'thread per request' or 'reactive programming,' but true decoupling requires a persistence layer (the queue). The queue acts as a shock absorber for traffic spikes. The tradeoffs are added latency (responses are no longer real-time) and delivery-semantics complexity: most queues deliver at least once, so consumers must be idempotent to achieve effectively exactly-once processing. Teams often miss that queue depth, not CPU, is the correct scaling metric: consumers blocked on slow downstream I/O show low CPU while the backlog grows, so CPU-based autoscaling under-scales during backpressure.
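The idempotency requirement can be sketched as a consumer that records message IDs and skips redeliveries. This is an illustration under assumptions, not a production design: the `IdempotentConsumer` class, the `message_id` field, and the `credit` handler are hypothetical, and the in-memory set stands in for a durable deduplication store.

```python
class IdempotentConsumer:
    """Wraps a handler so a redelivered message is applied at most once."""

    def __init__(self, handler):
        self._handler = handler
        # Assumption: in-memory set for illustration only; a real consumer
        # would need a durable store that survives restarts.
        self._seen = set()

    def handle(self, message: dict) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        message_id = message["message_id"]
        if message_id in self._seen:
            return False
        self._handler(message)
        self._seen.add(message_id)
        return True


balances = {"acct-1": 0}


def credit(message: dict) -> None:
    """Hypothetical side effect that must not be applied twice."""
    balances[message["account"]] += message["amount"]


consumer = IdempotentConsumer(credit)
delivery = {"message_id": "m-1", "account": "acct-1", "amount": 50}

# The queue redelivers the same message; only the first delivery takes effect.
results = [consumer.handle(delivery), consumer.handle(delivery)]
```

In this sketch the ID is recorded only after the handler succeeds, so a crash between the two steps would still allow a repeat. Closing that gap typically means committing the side effect and the deduplication record in the same transaction.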
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T13:50:59.335315+00:00— report_created — created