Report #14594
[gotcha] Kubernetes CPU limits causing unnecessary throttling and latency spikes despite idle CPU
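Remove CPU limits in single-tenant or dedicated-node environments and set CPU requests only. If limits must stay in the pod specs, run kernel >=5.4, which fixed a CFS quota accounting bug that caused excessive throttling; burst handling (`cpu.max.burst` on cgroup v2) needs kernel >=5.14. Alternatively, start the kubelet with `--cpu-cfs-quota=false`, which turns off CFS quota enforcement of CPU limits for every container on that node.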
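Before changing limits, it may help to confirm that throttling is actually happening. The sketch below computes the share of CFS periods in which a cgroup was throttled, using the `nr_periods` and `nr_throttled` counters that the kernel reports in `cpu.stat`. The function name `throttle_ratio` and the sample values are hypothetical, and the file location is an assumption (commonly `/sys/fs/cgroup/cpu.stat` inside a cgroup v2 container, or under `/sys/fs/cgroup/cpu/` on cgroup v1), so verify the path on your nodes.

```python
def throttle_ratio(cpu_stat_text: str) -> float:
    """Return nr_throttled / nr_periods from the text of a cpu.stat file.

    Returns 0.0 when no periods have been recorded (for example, when the
    cgroup has no CPU limit and the counters are absent or zero).
    """
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        # Each line of cpu.stat is "<key> <integer value>".
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0
    return stats.get("nr_throttled", 0) / periods


# Hypothetical sample, shaped like a cgroup v1 cpu.stat file.
sample = "nr_periods 1000\nnr_throttled 250\nthrottled_time 123456789\n"
ratio = throttle_ratio(sample)
```

In practice the text would come from reading the real file, for example `open(path).read()`. The counters are cumulative, so comparing two readings taken some time apart says more than a single value.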
Journey Context:
Kubernetes uses the CFS (Completely Fair Scheduler) quota mechanism to enforce CPU limits. It sets `cpu.cfs_quota_us` and `cpu.cfs_period_us`, and the kernel ensures that the container's cgroup uses no more than quota microseconds of CPU time in each period (100 ms by default). However, quota is accounted per period and handed out to each CPU in slices (5 ms by default), so a container that bursts, especially across several threads, can exhaust its quota early in a period and then sit throttled until the next period begins, even if the node has plenty of idle CPU. This causes p99 latency spikes. The tradeoff: without limits, a container can starve others on the same node; with limits, you get throttling. In dedicated-node scenarios, removing limits is the right call. Cgroup v2 exposes the same quota and period through `cpu.max` and adds burst support through `cpu.max.burst` (kernel >=5.14), but adoption is partial.
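The arithmetic behind this can be sketched in a few lines. This is an idealised model, not the kernel's algorithm: it assumes the default 100 ms period, assumes every thread runs flat out from the start of the period, and ignores slice granularity. The function names `quota_us` and `throttled_us` are hypothetical.

```python
PERIOD_US = 100_000  # assumed default cpu.cfs_period_us (100 ms)


def quota_us(cpu_limit: float, period_us: int = PERIOD_US) -> int:
    """CPU time allowed per period for a given CPU limit (in cores)."""
    return round(cpu_limit * period_us)


def throttled_us(cpu_limit: float, threads: int, period_us: int = PERIOD_US) -> int:
    """Wall-clock time per period spent throttled in the idealised model.

    With `threads` busy threads, quota is consumed `threads` times faster
    than wall-clock time, so it can run out long before the period ends.
    """
    quota = quota_us(cpu_limit, period_us)
    running_wall_us = min(period_us, quota // threads)
    return period_us - running_wall_us


# A 2-CPU limit with 8 busy threads: the 200 ms quota is used up after
# 25 ms of wall-clock time, leaving 75 ms of each period throttled.
stall = throttled_us(2.0, 8)
```

Under this model a request that arrives just after the quota runs out waits up to 75 ms before any thread in the container is scheduled again, no matter how idle the rest of the node is.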
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T21:53:45.076754+00:00— report_created — created