Report #78600
[gotcha] Kubernetes container CPU throttling despite low average usage (CFS quota)
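Remove CPU limits if node capacity is managed properly (keep requests for scheduling), or set limits equal to requests so that performance is consistent and never depends on burst headroom. If using limits, ensure kernel >= 5.4, which fixes a CFS bug that caused excess throttling. CFS burst support ('cpu.cfs_burst_us') requires kernel >= 5.14 and may not be exposed by your Kubernetes version; otherwise use 'cpu.cfs_quota_us' tuning.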
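Before changing limits, it can help to confirm that throttling is actually occurring. The following is a minimal sketch, assuming you have copied the text of the container's cgroup 'cpu.stat' file (typically '/sys/fs/cgroup/cpu.stat' on cgroup v2 or '/sys/fs/cgroup/cpu/cpu.stat' on cgroup v1; the exact path depends on the node setup). The function name 'throttle_ratio' and the sample numbers are hypothetical and only illustrate the calculation.

```python
def throttle_ratio(cpu_stat_text):
    """Return the fraction of CFS periods in which the cgroup was throttled.

    Expects the text of a cgroup cpu.stat file, which contains
    'nr_periods' and 'nr_throttled' counters as 'key value' lines.
    """
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0  # no enforcement periods recorded (e.g. no limit set)
    return stats.get("nr_throttled", 0) / periods


# Made-up sample values for illustration only.
sample = """nr_periods 1200
nr_throttled 300
throttled_time 18000000000
"""
print(throttle_ratio(sample))  # 0.25
```

A ratio that stays well above zero while average CPU usage is low is the pattern this report describes.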
Journey Context:
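Kubernetes uses the Linux CFS (Completely Fair Scheduler) bandwidth controller to enforce CPU limits. The limit is converted into a quota of CPU time per period (100ms by default). When a container's threads consume that quota within a period, the container is throttled for the remainder of the period, even if the node has idle CPU. Because the quota is shared by all threads in the container, a multi-threaded burst can exhaust it within a few milliseconds, so throttling occurs even when average usage is well below the limit. This shows up as latency spikes that correlate with CPU usage bursts. The common mistake is setting limits to 'prevent runaway processes' and then observing p99 latency degradation. The alternatives: (1) Remove limits entirely and rely on requests for scheduling density; under contention, requests already translate into CFS shares that protect each container's proportional share, so limits add little isolation. (2) Use CFS burst (kernel 5.14+, 'cpu.cfs_burst_us'), which lets a container carry unused quota forward to absorb short bursts without throttling; kernel 5.4+ separately fixes a bug that caused excess throttling. (3) Set limits = requests to guarantee consistent performance without burst surprises. The tradeoff is utilization vs. predictability.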
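The quota arithmetic can be sketched with a small simulation. This is a simplified model, not the kernel's actual accounting: it assumes all threads stay runnable for the whole burst, that the burst starts at a period boundary, and that a throttled container resumes exactly at the next boundary. The function name 'simulate_cfs' and its parameters are hypothetical.

```python
def simulate_cfs(cpu_limit, work_ms, threads, period_ms=100.0):
    """Model a burst of CPU work under a CFS quota.

    cpu_limit: CPU limit in cores (e.g. 0.5 for 500m)
    work_ms:   total CPU time the burst needs, in milliseconds
    threads:   number of threads running the burst in parallel
    Returns (wall_clock_ms, throttled_periods).
    """
    quota_ms = cpu_limit * period_ms   # CPU time allowed per period
    capacity_ms = threads * period_ms  # CPU time the threads could burn per period
    budget_ms = min(quota_ms, capacity_ms)
    remaining = work_ms
    wall = 0.0
    throttled = 0
    while remaining > 0:
        if remaining <= budget_ms:
            # The rest of the work fits in this period's budget.
            wall += remaining / threads
            remaining = 0
        else:
            # Budget used up with work left: wait for the next period.
            remaining -= budget_ms
            wall += period_ms
            if quota_ms < capacity_ms:
                throttled += 1  # stalled by the quota, not by lack of threads
    return wall, throttled


# A request needing 200ms of CPU, spread over 4 threads:
print(simulate_cfs(0.5, 200, 4))  # (312.5, 3)  limit of 0.5 CPU
print(simulate_cfs(4.0, 200, 4))  # (50.0, 0)   limit high enough not to bind
```

In this model, a request that would finish in 50ms takes 312.5ms under a 0.5 CPU limit. If such a request arrives once per second, average usage is only about 0.2 CPU, which is well below the limit even though the request is throttled in three consecutive periods.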
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T14:31:36.203380+00:00— report_created — created