Report #9093
[gotcha] Kubernetes CPU throttling despite low actual CPU usage and idle nodes
Remove CPU limits from latency-sensitive services and rely on CPU requests alone, or migrate to cgroup v2 \(Linux kernel 5.8\+\), which may reduce CFS quota enforcement stalls.
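Before changing limits, it can help to confirm that throttling is really happening. The sketch below computes the share of CFS periods in which a cgroup was throttled, from the text of its 'cpu.stat' file. The counters `nr_periods` and `nr_throttled` are standard CFS bandwidth fields; the sample values are invented for illustration, and the file location \(often `/sys/fs/cgroup/cpu.stat` inside a container on cgroup v2\) is an assumption that depends on how cgroups are mounted.

```python
def throttle_ratio(cpu_stat_text: str) -> float:
    """Return the fraction of CFS periods in which the cgroup was throttled."""
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        # Each line of cpu.stat is "<name> <integer value>".
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        # No CPU limit set, or no periods recorded yet.
        return 0.0
    return stats.get("nr_throttled", 0) / periods


# Invented sample in cgroup v2 layout; real values come from the container's cpu.stat.
sample = """usage_usec 8250000
user_usec 7000000
system_usec 1250000
nr_periods 1000
nr_throttled 250
throttled_usec 21000000
"""

print(f"{throttle_ratio(sample):.0%} of periods throttled")
```

A high ratio alongside low average CPU usage matches the pattern this report describes.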
Journey Context:
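The Linux CFS \(Completely Fair Scheduler\) enforces CPU limits as a quota of CPU time per fixed period. When a container exhausts its quota within a period \(e.g., 100 ms of CPU time per 100 ms period, equivalent to a limit of 1 CPU\), it is throttled for the remainder of that period, even if the node has abundant idle CPU. A multi-threaded container can burn through its quota in a fraction of the period, so short bursts are stalled while utilization averaged over seconds or minutes stays low. The result is latency spikes that correlate with CPU throttling metrics but not with CPU utilization. The tradeoff is that removing limits risks noisy-neighbor problems on shared nodes; mitigate this by sizing nodes properly, using dedicated nodes for critical workloads, or moving to cgroup v2, whose 'cpu.max' interface is reported to distribute throttling more smoothly. Because 'cpu.max' still enforces a quota per period, confirm any improvement against throttling metrics.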
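The arithmetic behind this can be sketched with a simplified model. It assumes a burst starts at a period boundary with a full quota and that all threads run perfectly in parallel; the real scheduler hands out quota in slices per CPU, so actual numbers will differ. The function name and the workload figures are hypothetical.

```python
def burst_latency_ms(work_cpu_ms: float, quota_ms: float,
                     period_ms: float, threads: int) -> float:
    """Wall-clock time to finish a burst needing work_cpu_ms of CPU time.

    Simplified model: the burst starts at a period boundary with a full quota,
    and `threads` threads consume CPU time in parallel.
    """
    if work_cpu_ms < 0 or quota_ms <= 0 or period_ms <= 0 or threads <= 0:
        raise ValueError("work must be >= 0; quota, period and threads must be > 0")
    # CPU time usable in one period: capped by the quota and by the thread count.
    max_per_period = min(quota_ms, period_ms * threads)
    elapsed = 0.0
    remaining = work_cpu_ms
    while remaining > max_per_period:
        # Budget is spent early; the container waits for the next period.
        remaining -= max_per_period
        elapsed += period_ms
    # The last chunk fits within the budget and runs unthrottled.
    return elapsed + remaining / threads


# Hypothetical burst: 150 ms of CPU work spread over 8 threads.
limited = burst_latency_ms(150, quota_ms=100, period_ms=100, threads=8)
unlimited = burst_latency_ms(150, quota_ms=float("inf"), period_ms=100, threads=8)
print(f"with a 1-CPU limit: {limited} ms, without a limit: {unlimited} ms")
```

In this model the limited burst spends 12.5 ms running and 87.5 ms throttled in its first period. If such a burst arrives once per second, average usage is only 0.15 CPU against a limit of 1 CPU, yet each burst takes several times longer than it would unthrottled.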
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T07:16:37.088428+00:00— report_created — created