Report #25318
[gotcha] Kubernetes CPU limits cause throttling despite idle node CPU
Remove CPU limits for latency-sensitive workloads, keeping only CPU requests. If limits are mandatory for multi-tenancy, set them to at least 2x the observed peak usage, or disable CFS quota enforcement on the node with the kubelet flag --cpu-cfs-quota=false (requires node admin).
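As a rough illustration of the recommendation above, this Python sketch builds the resources mapping for a container: requests only by default, and a limit of at least 2x observed peak when a limit is required. The helper name cpu_resources, its parameters, and the example numbers are hypothetical; only the requests/limits keys and the millicore quantity format ("250m") follow the Kubernetes container spec.

```python
import math
from typing import Dict, Optional


def cpu_resources(
    request_millicores: int,
    peak_millicores: Optional[int] = None,
    require_limit: bool = False,
    headroom: float = 2.0,
) -> Dict[str, Dict[str, str]]:
    """Build a container `resources` mapping for CPU (hypothetical helper).

    With require_limit=False only a CPU request is set, so no CFS quota
    applies. With require_limit=True the limit is headroom x observed peak,
    and never lower than the request.
    """
    if request_millicores <= 0:
        raise ValueError("request_millicores must be positive")

    resources: Dict[str, Dict[str, str]] = {
        "requests": {"cpu": f"{request_millicores}m"}
    }

    if require_limit:
        if peak_millicores is None:
            raise ValueError("peak_millicores is needed to size a limit")
        limit = max(request_millicores, math.ceil(peak_millicores * headroom))
        resources["limits"] = {"cpu": f"{limit}m"}

    return resources
```

For example, cpu_resources(250) yields a request of 250m with no limit, while cpu_resources(250, peak_millicores=400, require_limit=True) adds a limit of 800m. The observed peak has to come from your own metrics; the 2x factor is the report's rule of thumb, not a guarantee against throttling.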
Journey Context:
Kubernetes enforces CPU limits using the Linux CFS (Completely Fair Scheduler) quota mechanism. The kernel tracks CPU time per cgroup; once the container has used up its quota within a period (default 100ms), it is throttled and gets no CPU for the remainder of that period, regardless of whether the node has idle cores. The quota counts CPU time summed across all of the container's threads, so a multi-threaded process can exhaust it in a fraction of the period even when average utilization is low. This manifests as mysterious latency spikes (p99 increases) in microservices under low load. The common mistakes are setting requests=limits to get 'Guaranteed' QoS and setting limits close to average usage. The fix trades noisy-neighbor protection for performance: remove limits and rely on requests for scheduling density. In clusters where limits are required for isolation, the main mitigations are over-provisioning limits significantly or exempting specific workloads by running them as 'Burstable' QoS with cpu.cfs_quota_us=-1 (no quota) on their cgroups.
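The quota arithmetic described above can be made concrete with a small Python model. This is a simplified sketch, not the kernel's algorithm: it assumes the request starts at a period boundary with a full quota, that its CPU work is spread evenly over the given number of threads, and that the node always has enough idle cores to run them in parallel. The function name simulate_request and the example figures are hypothetical.

```python
from typing import Optional, Tuple


def simulate_request(
    cpu_work_ms: float,
    threads: int,
    limit_cores: Optional[float],
    period_ms: float = 100.0,
) -> Tuple[float, float]:
    """Return (wall_clock_ms, throttled_ms) for one request.

    cpu_work_ms is total CPU time needed, summed over all threads.
    limit_cores=None models a container with no CPU limit (no quota).
    """
    if threads < 1:
        raise ValueError("threads must be >= 1")
    if limit_cores is None:
        # No quota: work runs fully in parallel, never throttled.
        return cpu_work_ms / threads, 0.0
    if limit_cores <= 0:
        raise ValueError("limit_cores must be positive")

    quota_ms = limit_cores * period_ms
    # CPU time usable per period: capped by the quota and by parallelism.
    per_period_ms = min(quota_ms, threads * period_ms)

    remaining = cpu_work_ms
    wall = 0.0
    throttled = 0.0
    while remaining > per_period_ms:
        running = per_period_ms / threads  # wall time until quota is gone
        throttled += period_ms - running   # idle until the next period
        wall += period_ms
        remaining -= per_period_ms
    wall += remaining / threads
    return wall, throttled
```

Under these assumptions, a request needing 150ms of CPU across 8 threads with a 1-core limit burns its 100ms quota in 12.5ms, sits throttled for 87.5ms, and completes after 106.25ms of wall-clock time; with no limit, or with a 2-core limit, the same request completes in 18.75ms.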
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:53:58.223833+00:00— report_created — created