Report #21120
[gotcha] Container CPU throttling under CFS quota despite available CPU on node
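For latency-sensitive applications in Kubernetes, set CPU requests but no CPU limits: without a limit no CFS quota is applied, so throttling disappears, while the request still drives scheduling and the container's proportional CPU share. If limits are mandatory (multi-tenant clusters), either raise `cpu.cfs_period_us` (default 100ms) to 200ms or longer via the kubelet's `--cpu-cfs-quota-period` flag (rarely done, and gated behind the `CustomCPUCFSQuotaPeriod` feature gate; the quota scales with the period, so a throttled container also stalls for longer), or preferably use `cpu.cfs_burst_us` (Linux kernel 5.14+, containerd 1.7+) to allow bursts within the limit. Alternatively, use the `static` CPU manager policy, which gives Guaranteed pods with integer CPU requests exclusive cores so they no longer contend with other pods for CPU time.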
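To make the relationship between a CPU limit, the period, and the resulting quota concrete, here is a minimal Python sketch of the arithmetic. It assumes the quota scales linearly with the limit (quota = limit in cores × period); the function name `cpu_limit_to_cfs_quota_us` is hypothetical and is not a kubelet or kernel API.

```python
def cpu_limit_to_cfs_quota_us(limit_millicores: int, period_us: int = 100_000) -> int:
    """Return the CFS quota (microseconds per period) implied by a CPU limit.

    Hypothetical helper for illustration only. Assumes the quota is simply
    the limit expressed in cores multiplied by the period length.
    """
    if limit_millicores <= 0:
        raise ValueError("limit_millicores must be positive")
    if period_us <= 0:
        raise ValueError("period_us must be positive")
    # 1000 millicores == 1 full core == one whole period of CPU time.
    return limit_millicores * period_us // 1000


if __name__ == "__main__":
    # 500m limit with the default 100ms period: 50ms of CPU time per period.
    print(cpu_limit_to_cfs_quota_us(500))            # 50000
    # Same limit with a 200ms period: the budget doubles, the ratio does not.
    print(cpu_limit_to_cfs_quota_us(500, 200_000))   # 100000
```

Doubling the period leaves the average CPU share unchanged (still half a core here) but lets a single burst run for up to 100ms before being throttled, at the cost of a longer stall once the budget is exhausted.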
Journey Context:
The Linux CFS (Completely Fair Scheduler) enforces CPU limits through `cpu.cfs_quota_us`, a budget of CPU time per period (`cpu.cfs_period_us`, default 100ms). With a 50ms quota, a container that burns its budget in the first 50ms of a period (sooner if several threads run in parallel) is throttled for the rest of that period, even if the node has idle CPU. This shows up as high tail latency (P99 spikes) in microservices despite low average CPU usage. Common mistakes: setting CPU limits equal to requests for "safety"; assuming limits only protect against noisy neighbors, without accounting for the throttling penalty; not monitoring `container_cpu_cfs_throttled_seconds_total`. Alternatives considered: relying on `cpu.shares` alone (requests only), which allows bursting but does not guarantee isolation; pinning exclusive cores with static CPU sets (`cpuset.cpus`), which is best for high throughput but wastes capacity. Why the fix is right: running requests-only is standard practice for in-house latency-sensitive services on dedicated nodes or well-managed clusters, because the request provides the scheduling guarantee while still allowing bursts. For strict isolation, static CPU management is superior to CFS limits.
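The throttling penalty described under Journey Context can be illustrated with a small simulation. This is a simplified model, not the kernel's actual accounting: it assumes the work starts at a period boundary with a full quota, parallelises perfectly across threads, and always finds idle CPU on the node. The names `wall_clock_ms` and `throttled_period_ratio` are hypothetical. The ratio helper assumes you feed it the cAdvisor counters `container_cpu_cfs_throttled_periods_total` and `container_cpu_cfs_periods_total`.

```python
def wall_clock_ms(cpu_work_ms: float, quota_ms: float,
                  period_ms: float = 100.0, threads: int = 1) -> float:
    """Wall-clock time needed to complete cpu_work_ms of CPU work under a quota.

    Simplified model (assumption): each period grants quota_ms of CPU time;
    once it is spent, the container waits for the next period to begin.
    """
    if cpu_work_ms <= 0 or quota_ms <= 0 or period_ms <= 0 or threads < 1:
        raise ValueError("work, quota and period must be positive; threads >= 1")
    elapsed = 0.0
    remaining = cpu_work_ms
    while True:
        # CPU time usable in one period: capped by the quota and by what
        # the threads can physically consume in period_ms of wall time.
        budget = min(quota_ms, threads * period_ms)
        if remaining <= budget:
            return elapsed + remaining / threads
        remaining -= budget
        elapsed += period_ms  # throttled (or busy) until the period ends


def throttled_period_ratio(throttled_periods: int, total_periods: int) -> float:
    """Fraction of CFS periods in which the container was throttled."""
    return throttled_periods / total_periods if total_periods else 0.0


if __name__ == "__main__":
    # 120ms of CPU work, 50ms quota per 100ms period, single thread.
    print(wall_clock_ms(120, 50))             # 220.0
    # Same work with a quota too large to ever bind.
    print(wall_clock_ms(120, 1000))           # 120.0
    # Four threads exhaust the 50ms quota in 12.5ms, then stall.
    print(wall_clock_ms(120, 50, threads=4))  # 205.0
    print(throttled_period_ratio(25, 100))    # 0.25
```

In this model the four-thread case would finish in 30ms without a limit but takes 205ms with one, which is how a service with low average CPU usage can still show large P99 spikes.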
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:51:40.541002+00:00— report_created — created