Report #42575
[gotcha] Kubernetes CPU limits cause severe latency spikes due to CFS quota throttling at 100ms boundaries
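Remove CPU limits entirely for latency-sensitive workloads and rely on CPU requests, which drive scheduling and set the container's proportional share under contention. If limits are mandatory, increase the CFS quota period (cpu.cfs_period_us) so that each window carries a larger quota and a short burst is less likely to exhaust it. This requires kubelet configuration (the --cpu-cfs-quota-period flag) and lengthens the worst-case stall once the quota does run out. Alternatively, enable CPU burst features if your kernel and Kubernetes version support them.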
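To make the period trade-off concrete, here is a minimal Python sketch of the quota arithmetic. It assumes the quota scales linearly with the limit and the period, and that a single thread spends its whole quota at the start of a window. The function names and the 100m example limit are illustrative, not taken from any Kubernetes API.

```python
DEFAULT_PERIOD_US = 100_000  # 100 ms, the default CFS period


def quota_us(limit_millicores: int, period_us: int = DEFAULT_PERIOD_US) -> int:
    """CPU time (in microseconds) a container may consume in each period."""
    return limit_millicores * period_us // 1000


def worst_case_stall_us(limit_millicores: int, period_us: int = DEFAULT_PERIOD_US) -> int:
    """Longest wait for one thread that spends its whole quota at the start of a period.

    Simplified model: a limit of one full core or more never stalls a single thread.
    """
    return max(0, period_us - quota_us(limit_millicores, period_us))


if __name__ == "__main__":
    # Hypothetical container limited to 100m (0.1 CPU) under three period settings.
    for period in (10_000, 100_000, 1_000_000):
        q = quota_us(100, period)
        stall = worst_case_stall_us(100, period)
        print(f"period={period}us quota={q}us worst_case_stall={stall}us")
```

Under these assumptions a longer period gives a larger per-window burst budget but a proportionally longer stall once the budget is spent, so the right setting depends on how bursty the workload is.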
Journey Context:
The Linux CFS (Completely Fair Scheduler) enforces CPU limits with a bandwidth quota that is refilled every period, 100 ms by default (cpu.cfs_period_us=100000). If a container burns its entire quota in the first 10 ms of a period, it is throttled for the remaining 90 ms, even when its average CPU usage over time is well below the limit. Work that needs more than one period's quota stalls repeatedly, so request latency can grow by 100 ms or more. These spikes are invisible in standard CPU utilization metrics, which report averages; the throttling counters in the container's cgroup cpu.stat file (for example nr_throttled) do expose them. The common community recommendation is to avoid CPU limits for latency-sensitive applications; CPU requests, which set a proportional share rather than a hard cap, are usually sufficient for multi-tenant scheduling.
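The gap between average utilization and observed latency can be illustrated with a small Python model. This is a sketch under simplifying assumptions, not a reproduction of the kernel's behavior: one single-threaded request starts at a period boundary with a full quota, nothing else runs in the container, and the kernel's finer-grained quota slicing is ignored. All names and numbers are hypothetical.

```python
def wall_clock_latency_ms(cpu_needed_ms: float, quota_ms: float,
                          period_ms: float = 100.0) -> float:
    """Wall-clock time to finish a request that needs cpu_needed_ms of CPU.

    The thread runs until the quota for the current period is spent, then
    waits for the next period boundary before it can run again.
    """
    if quota_ms <= 0:
        raise ValueError("quota_ms must be positive")
    if quota_ms >= period_ms:
        # A single thread cannot use more than one period of CPU per period,
        # so it is never throttled in this model.
        return cpu_needed_ms
    elapsed = 0.0
    remaining = cpu_needed_ms
    while remaining > quota_ms:
        remaining -= quota_ms   # spend the whole quota for this period
        elapsed += period_ms    # then wait for the next refill
    return elapsed + remaining


def average_utilization(cpu_needed_ms: float, arrival_interval_ms: float) -> float:
    """Average CPU usage in cores when one request arrives per interval."""
    return cpu_needed_ms / arrival_interval_ms


if __name__ == "__main__":
    # Hypothetical workload: 30 ms of CPU per request, one request per second,
    # container limited to 0.1 CPU (10 ms of quota per 100 ms period).
    latency = wall_clock_latency_ms(30, 10)
    usage = average_utilization(30, 1000)
    print(f"latency={latency} ms, average usage={usage} cores, limit=0.1 cores")
```

In this model the request takes 210 ms of wall-clock time instead of 30 ms, while average usage is 0.03 cores, well under the 0.1 core limit. That is why a utilization dashboard alone would not reveal the problem.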
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T01:55:53.115553+00:00— report_created — created