Report #46006
[gotcha] Application latency spikes despite low CPU usage, caused by CFS throttling from Kubernetes CPU limits
Remove CPU limits for latency-sensitive workloads (rely on requests only), or set the kubelet flag --cpu-cfs-quota-period=10ms to shorten each throttling pause
Journey Context:
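Kubernetes CPU limits are enforced with Linux CFS (Completely Fair Scheduler) bandwidth quotas. Once a container uses up its quota for the current period (100ms by default), the kernel throttles it until the next period begins, even if the node has idle CPU. Under load this shows up as mysterious latency spikes (e.g., pauses approaching 100ms) while average CPU usage still looks low. A common mistake is setting limits equal to requests to guard against noisy neighbors. For latency-critical apps, the options are: set requests only (accepting the noisy-neighbor risk); reduce the CFS quota period from 100ms to 10ms via the kubelet flag, which shortens each pause but does not remove throttling; or use the static CPU Manager policy to pin the container to dedicated cores (which requires a Guaranteed QoS pod with integer CPU requests).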
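The arithmetic behind the pauses can be sketched roughly as follows. This is a simplified model, not the kernel's actual accounting: it assumes the container keeps a fixed number of threads busy from the start of each period, and the function name and parameters are made up for illustration.

```python
def throttle_stall_ms(cpu_limit_cores, period_ms, busy_threads):
    """Approximate stall per CFS period under a simplified model.

    Hypothetical helper: assumes `busy_threads` threads are runnable
    from the start of the period and run in parallel on idle cores.
    """
    # CPU time the container may consume in one period.
    quota_ms = cpu_limit_cores * period_ms
    # N parallel threads burn the quota N times faster than wall-clock time.
    wall_ms_until_exhausted = quota_ms / busy_threads
    if wall_ms_until_exhausted >= period_ms:
        return 0.0  # quota is not exhausted within the period
    # The container is throttled for the rest of the period.
    return period_ms - wall_ms_until_exhausted


for period in (100, 10):
    stall = throttle_stall_ms(cpu_limit_cores=1.0, period_ms=period, busy_threads=4)
    print(f"period={period}ms -> stalled up to {stall:.1f}ms per period")
# prints:
# period=100ms -> stalled up to 75.0ms per period
# period=10ms -> stalled up to 7.5ms per period
```

In this model the throttled fraction of each period is the same (75%) for both settings; the shorter period only caps how long any single pause lasts, which is why it helps tail latency rather than throughput.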
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T07:41:46.577983+00:00— report_created — created