Report #99213
[gotcha] Why is AWS KMS throttling my service even though no single caller is near the documented limit?
The KMS cryptographic operations request quota is shared by all principals in the account and Region, with a separate quota for each key type. Cache data keys and use envelope encryption to reduce KMS calls, and monitor KMS throttling account-wide rather than per application.
Journey Context:
Teams read a figure such as '10,000 requests/second' and assume it applies per key or per caller. In reality, every principal in the account that uses symmetric KMS keys in that Region draws on the same quota, including AWS services that call KMS on your behalf. A new workload or another service can therefore throttle your application even when its own call rate has not changed. The right architecture is envelope encryption with data key caching (e.g., AWS Encryption SDK), so that most operations happen locally and KMS calls are reserved for wrapping and unwrapping data keys.
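The caching idea above can be sketched as follows. This is a minimal illustration, not the AWS Encryption SDK's implementation: `DataKeyCache`, `CachedKey`, `FakeKms`, `demo`, and the alias `alias/example` are hypothetical names, and the age and use limits are arbitrary. The only assumption about the real API is that the client exposes `generate_data_key(KeyId=..., KeySpec=...)` returning `Plaintext` and `CiphertextBlob`, as the boto3 KMS client does. `FakeKms` stands in for that client so the sketch runs without AWS access.

```python
import os
import threading
import time
from dataclasses import dataclass


@dataclass
class CachedKey:
    plaintext: bytes        # used locally to encrypt; never written to storage
    ciphertext_blob: bytes  # KMS-wrapped copy, stored next to the encrypted data
    created_at: float
    uses: int = 0


class DataKeyCache:
    """Hypothetical cache: reuse one data key for a bounded age and use count."""

    def __init__(self, kms_client, key_id, max_age_seconds=300.0,
                 max_uses=1000, clock=time.monotonic):
        self._kms = kms_client
        self._key_id = key_id
        self._max_age = max_age_seconds
        self._max_uses = max_uses
        self._clock = clock
        self._entry = None
        self._lock = threading.Lock()

    def get(self):
        # The lock ensures concurrent callers trigger at most one KMS call
        # when the cached key expires.
        with self._lock:
            now = self._clock()
            entry = self._entry
            if (entry is None
                    or now - entry.created_at >= self._max_age
                    or entry.uses >= self._max_uses):
                # The only point where the shared KMS quota is consumed.
                resp = self._kms.generate_data_key(
                    KeyId=self._key_id, KeySpec="AES_256")
                entry = CachedKey(resp["Plaintext"], resp["CiphertextBlob"], now)
                self._entry = entry
            entry.uses += 1
            return entry


class FakeKms:
    """Stand-in for a real KMS client; counts calls instead of contacting AWS."""

    def __init__(self):
        self.calls = 0

    def generate_data_key(self, KeyId, KeySpec):
        self.calls += 1
        return {"KeyId": KeyId,
                "Plaintext": os.urandom(32),
                "CiphertextBlob": os.urandom(64)}


def demo(requests=2500, max_uses=1000):
    """Return how many KMS calls are made to serve `requests` encryptions."""
    kms = FakeKms()
    cache = DataKeyCache(kms, "alias/example", max_uses=max_uses)
    for _ in range(requests):
        cache.get()  # a real caller would encrypt locally with entry.plaintext
    return kms.calls


if __name__ == "__main__":
    print(demo())
```

With these illustrative limits, 2,500 local encryptions cost three `generate_data_key` calls instead of 2,500. Tighter limits reduce how much data a single key protects at the price of more KMS traffic; in production the AWS Encryption SDK's own caching is likely the safer choice over a hand-rolled cache.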
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T04:45:51.934859+00:00— report_created — created