Report #50552
[gotcha] Kubernetes CPU limits causing latency spikes via CFS throttling despite idle node capacity
Remove CPU limits entirely (relying only on requests) for latency-sensitive workloads, or use the CPU Manager static policy to pin containers to specific cores; avoid CFS quotas on latency-critical containers.
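To make the two options above concrete, here is a minimal Python sketch that inspects a container `resources` stanza and reports whether it would carry a CFS quota and whether it would qualify for exclusive cores under the static policy. The helper names (`parse_cpu`, `has_cfs_quota`, `qos_class`, `eligible_for_exclusive_cores`) are hypothetical and not part of any Kubernetes client. The QoS logic is simplified to a single-container pod and compares memory quantities as plain strings.

```python
from fractions import Fraction


def parse_cpu(quantity: str) -> Fraction:
    """Convert a Kubernetes CPU quantity ("500m", "2", "0.5") to cores."""
    if quantity.endswith("m"):
        return Fraction(int(quantity[:-1]), 1000)
    return Fraction(quantity)


def has_cfs_quota(resources: dict) -> bool:
    """With default kubelet settings, a CFS quota exists only if a CPU limit is set."""
    return "cpu" in resources.get("limits", {})


def qos_class(resources: dict) -> str:
    """Simplified QoS classification for a single-container pod."""
    limits = resources.get("limits", {})
    # A missing request defaults to the limit for the same resource.
    requests = {**limits, **resources.get("requests", {})}
    if not limits and not requests:
        return "BestEffort"
    cpu_equal = (
        "cpu" in limits and parse_cpu(limits["cpu"]) == parse_cpu(requests["cpu"])
    )
    # Assumption: memory quantities are written identically, so strings compare.
    mem_equal = "memory" in limits and limits["memory"] == requests["memory"]
    return "Guaranteed" if cpu_equal and mem_equal else "Burstable"


def eligible_for_exclusive_cores(resources: dict) -> bool:
    """The static policy pins only Guaranteed containers with whole-core CPU counts."""
    if qos_class(resources) != "Guaranteed":
        return False
    return parse_cpu(resources["limits"]["cpu"]).denominator == 1


# Option 1: CPU request only; the memory limit is kept, the CPU limit is dropped.
requests_only = {
    "requests": {"cpu": "500m", "memory": "256Mi"},
    "limits": {"memory": "256Mi"},
}
# Option 2: whole cores with requests equal to limits, as the static policy needs.
pinned = {
    "requests": {"cpu": "2", "memory": "1Gi"},
    "limits": {"cpu": "2", "memory": "1Gi"},
}
# The throttling-prone shape: a fractional CPU limit.
fractional = {
    "requests": {"cpu": "500m", "memory": "256Mi"},
    "limits": {"cpu": "500m", "memory": "256Mi"},
}

for name, res in [("requests_only", requests_only), ("pinned", pinned), ("fractional", fractional)]:
    print(name, qos_class(res), has_cfs_quota(res), eligible_for_exclusive_cores(res))
```

The `fractional` case is the one to watch for: it is Guaranteed, yet it still carries a quota for half a core and cannot be pinned.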
Journey Context:
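Platform engineers set CPU limits to ensure fair sharing and prevent noisy neighbors, expecting that a container can still borrow unused CPU on the node when it needs it. However, Kubernetes enforces CPU limits with Linux CFS (Completely Fair Scheduler) bandwidth quotas: each container gets a budget of CPU time per period, computed as limit * period, where the period defaults to 100ms. A limit of 500m, for example, yields 50ms of CPU time per 100ms period. If a container uses its entire quota within a period, it is throttled for the remainder of that period even if the node's CPUs are otherwise idle, which causes tail latency spikes and timeouts. Multi-threaded containers reach this point sooner, because every thread draws from the same quota. This is counter-intuitive because the resource is available but artificially restricted by the accounting period. The fixes are: removing CPU limits entirely and relying only on requests (the pod then runs in the Burstable QoS class and can use idle CPU), using the CPU Manager static policy to pin containers to specific cores (this requires Guaranteed QoS with a whole-number CPU count, so the quota covers the full capacity of the pinned cores instead of a fraction of a core), or increasing the CFS quota period (though this reduces granularity, since a throttled container can then stall for longer). Removing limits accepts the risk of noisy neighbors in exchange for avoiding guaranteed latency degradation.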
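The throttling arithmetic described above can be sketched with a small Python model. It is a deliberately simplified illustration, not the kernel's actual accounting: it assumes the request starts at a period boundary, that all threads run fully in parallel on otherwise idle cores, and that the work divides evenly across threads. The function name `wall_clock_ms` and its parameters are hypothetical.

```python
from typing import Optional


def wall_clock_ms(
    cpu_work_ms: int,
    limit_millicores: Optional[int],
    threads: int = 1,
    period_ms: int = 100,
) -> float:
    """Estimate wall-clock time to finish cpu_work_ms of CPU time.

    limit_millicores=None models a container with no CPU limit (no quota).
    """
    if limit_millicores is None:
        return cpu_work_ms / threads

    # Quota: CPU time the container may consume in each period.
    quota_ms = limit_millicores * period_ms // 1000
    # Threads cannot burn more CPU time in a period than they physically have.
    per_period_ms = min(quota_ms, threads * period_ms)

    elapsed_ms = 0
    remaining_ms = cpu_work_ms
    while remaining_ms > per_period_ms:
        # Budget exhausted: the container sits throttled until the next period.
        remaining_ms -= per_period_ms
        elapsed_ms += period_ms
    # The final slice fits in the budget and runs without throttling.
    return elapsed_ms + remaining_ms / threads


# Single thread, 100ms of CPU work, 400m limit (40ms budget per 100ms period).
single_limited = wall_clock_ms(100, limit_millicores=400)
single_unlimited = wall_clock_ms(100, limit_millicores=None)

# Four threads, 400ms of CPU work, 2000m limit (200ms budget per period).
multi_limited = wall_clock_ms(400, limit_millicores=2000, threads=4)
multi_unlimited = wall_clock_ms(400, limit_millicores=None, threads=4)

print(single_limited, single_unlimited)
print(multi_limited, multi_unlimited)
```

In this model the single-threaded request runs for 40ms, waits 60ms, and repeats, so 100ms of work takes 220ms. The four-thread case exhausts its 200ms budget after only 50ms of wall-clock time and idles for the rest of the period, finishing in 150ms instead of 100ms, even though the limit is a full two cores.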
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T15:19:58.177656+00:00— report_created — created