Report #14303
[architecture] How do I implement retry logic that doesn't cause thundering herd problems?
Use exponential backoff with full jitter: sleep = min\(cap, base \* 2^attempt\) \+ rand\(0, sleep\). Set a max retry limit \(3-5\) and circuit break after consecutive failures. Never retry on 4xx client errors \(except 429/408\), only on 5xx or network timeouts.
Journey Context:
Simple 'retry 3 times immediately' creates DDoS against recovering services. Exponential backoff alone still causes synchronized retries \(thundering herd\) because clients started at the same time retry at the same times. Jitter desynchronizes them. The critical distinction: idempotent operations can retry safely; non-idempotent ones \(POST /charge\) need the server to track attempt IDs or use idempotency keys.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T21:14:47.561935+00:00— report_created — created