Report #51784
[gotcha] Kubernetes CPU throttling despite idle node capacity due to CFS quota
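Remove CPU limits for workloads that scale with HPA. If limits are required, set them equal to requests: throttling still applies at the limit, but the ceiling then matches the capacity the scheduler reserved, which keeps performance predictable.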
Journey Context:
Kubernetes enforces CPU limits through the Linux Completely Fair Scheduler (CFS), using the cfs_quota_us and cfs_period_us parameters. When a container exhausts its quota within a period (default 100ms), it is throttled until the next period begins, even if the node has idle CPU capacity. A multi-threaded container can burn through its quota in a fraction of the period, so throttling can occur even when average CPU usage looks well below the limit. The resulting latency spikes are hard to diagnose; they show up in the metric container_cpu_cfs_throttled_seconds_total. Many teams set limits 'to prevent runaway processes', but this works against the intent of horizontal scaling: with HPA, you want pods to use as much CPU as they need to handle load, then scale out. The tradeoff is that without limits a single runaway pod can compete with its neighbors for CPU. In practice, requests usually provide sufficient protection, because under contention each container still receives CPU time in proportion to its request. If you must have limits, set limits = requests: throttling still applies at the limit, but the ceiling then matches the capacity the scheduler reserved, which keeps performance predictable.
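The quota arithmetic described above can be sketched in a few lines of Python. This is a simplified model, not how the kernel accounts for runtime: it assumes every busy thread runs continuously on its own core, and the function names (`cfs_quota_us`, `throttled_us_per_period`, `throttle_ratio`) are hypothetical helpers introduced for illustration. The ratio helper assumes you have the cAdvisor counters container_cpu_cfs_throttled_periods_total and container_cpu_cfs_periods_total available, which this report does not mention.

```python
CFS_PERIOD_US = 100_000  # default CFS period of 100 ms, in microseconds


def cfs_quota_us(cpu_limit_millicores: int, period_us: int = CFS_PERIOD_US) -> int:
    """CPU time (in microseconds) the container may consume per period."""
    return cpu_limit_millicores * period_us // 1000


def throttled_us_per_period(cpu_limit_millicores: int, busy_threads: int,
                            period_us: int = CFS_PERIOD_US) -> int:
    """Wall-clock time per period spent throttled (simplified model).

    Assumption: each of busy_threads burns CPU continuously on its own core,
    so the quota is consumed busy_threads times faster than wall-clock time.
    """
    quota = cfs_quota_us(cpu_limit_millicores, period_us)
    time_to_exhaust_us = quota // busy_threads
    return max(0, period_us - time_to_exhaust_us)


def throttle_ratio(throttled_periods: int, total_periods: int) -> float:
    """Fraction of CFS periods in which the container was throttled."""
    return throttled_periods / total_periods if total_periods else 0.0


# A 500m limit gives 50 ms of CPU time per 100 ms period.
print(cfs_quota_us(500))                # 50000
# Four busy threads exhaust it in 12.5 ms, then wait 87.5 ms.
print(throttled_us_per_period(500, 4))  # 87500
# One busy thread under a 2000m limit never reaches the quota.
print(throttled_us_per_period(2000, 1))  # 0
```

Under these assumptions, a container limited to 500m with four busy threads is stalled for most of every period, regardless of how idle the node is.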
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T17:24:53.483126+00:00— report_created — created