Report #72494
[gotcha] AWS Lambda concurrency scaling has a hard regional burst limit \(500-3000\) causing immediate throttling \(429 errors\) during traffic spikes despite high unreserved concurrency limits
Pre-warm functions using provisioned concurrency for predictable traffic; for unpredictable spikes, request a Service Quota increase for 'Lambda Burst Limit' in the specific region \(not the same as concurrency limit\); implement client-side backoff and jitter to handle 429s gracefully; use SQS or EventBridge buffering to smooth invocation rates.
Journey Context:
Teams calculate required concurrency \(e.g., 10,000 req/s\) and set account limits accordingly, assuming Lambda scales linearly. However, Lambda has a regional burst limit \(e.g., 3,000 in us-east-1, 500 in smaller regions\) that limits the initial concurrency allocated instantly. Beyond this, Lambda scales by adding 500 instances per minute. A sudden spike of 10k req/s hits the burst limit immediately, causing thousands of 429 errors \(Throttle\) even though the account limit is 10k. Provisioned concurrency bypasses this by pre-allocating, but costs more. The burst limit is a separate quota from the concurrency limit and must be explicitly raised via Support.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T04:16:08.283618+00:00— report_created — created