Report #52694
[gotcha] Kubernetes CPU limits causing throttling despite low actual CPU usage
Remove CPU limits for latency-sensitive applications (relying on requests for scheduling), or use Kubernetes 1.27+ with CFS quota burst support (alpha feature), or migrate to cgroup v2, which handles CPU burst accounting more accurately. For Java/Node workloads, prefer 'requests only' with Burstable QoS.
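To make the 'requests only' recommendation concrete, here is a minimal Python sketch of how a container's resources map to a QoS class. The `qos_class` helper and the example values are hypothetical, written for illustration only. It models a single-container pod and compares quantities as plain strings, so "0.5" and "500m" would not be treated as equal.

```python
def qos_class(resources):
    """Simplified QoS classification for a single-container pod.

    `resources` mirrors the shape of a container's `resources` field:
    {"requests": {...}, "limits": {...}}. Hypothetical helper, not a
    Kubernetes API.
    """
    requests = dict(resources.get("requests", {}))
    limits = resources.get("limits", {})
    if not requests and not limits:
        return "BestEffort"
    # A missing request defaults to the limit for that resource.
    for name, value in limits.items():
        requests.setdefault(name, value)
    # Guaranteed needs cpu and memory limits that equal the requests.
    guaranteed = all(
        name in limits and requests.get(name) == limits[name]
        for name in ("cpu", "memory")
    )
    return "Guaranteed" if guaranteed else "Burstable"


# Illustrative values only (assumptions, not taken from the report).
requests_only = {"requests": {"cpu": "500m", "memory": "512Mi"}}
with_limits = {
    "requests": {"cpu": "500m", "memory": "512Mi"},
    "limits": {"cpu": "500m", "memory": "512Mi"},
}

if __name__ == "__main__":
    print(qos_class(requests_only))
    print(qos_class(with_limits))
```

Dropping the CPU limit moves the pod out of Guaranteed into Burstable: the scheduler still places it by its CPU request, but no CFS quota is set, so there is nothing to throttle against.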
Journey Context:
When CPU limits are set, Kubernetes configures a CFS (Completely Fair Scheduler) quota and period in cgroup v1. The kernel lets a container consume at most the quota, in microseconds of CPU time, per period (100 ms by default), and that quota is shared by all of the container's threads. Because of tick-based accounting (1ms resolution) and the burst patterns common in garbage-collected languages, a container can exhaust its quota in micro-bursts early in the period and then be throttled for the remainder, despite low average usage. This manifests as 'CPUThrottlingHigh' alerts and p99 latency spikes that profiling cannot explain. Counter-intuitively, removing limits (which sounds dangerous) improves latency stability when node capacity is properly managed via requests and cluster autoscaling.
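The arithmetic behind this can be sketched in a few lines of Python. This is a simplified model, not the kernel's algorithm: it assumes the default 100 ms period, assumes every thread runs on its own core, and ignores how the kernel hands quota out in slices. The function names and the workload numbers are hypothetical.

```python
PERIOD_US = 100_000  # default CFS period: 100 ms


def quota_us(limit_millicores, period_us=PERIOD_US):
    """CPU-microseconds the container may consume per period."""
    return limit_millicores * period_us // 1000


def simulate_burst(limit_millicores, threads, cpu_work_us, period_us=PERIOD_US):
    """Model one period in which `threads` threads run flat out until
    `cpu_work_us` of total CPU time is done or the quota runs out."""
    quota = quota_us(limit_millicores, period_us)
    cpu_used = min(cpu_work_us, quota)
    # Threads burn the shared quota in parallel, so wall time shrinks.
    running_wall_us = cpu_used / threads
    throttled = cpu_work_us > quota
    # Once throttled, the container waits for the next period to start.
    stalled_wall_us = period_us - running_wall_us if throttled else 0.0
    return {
        "quota_us": quota,
        "running_wall_us": running_wall_us,
        "throttled": throttled,
        "stalled_wall_us": stalled_wall_us,
    }


if __name__ == "__main__":
    # Assumed workload: a 500m limit, 4 busy threads (for example a
    # parallel GC), and 60 ms of total CPU work arriving at once.
    print(simulate_burst(limit_millicores=500, threads=4, cpu_work_us=60_000))
```

With these assumed numbers the quota is 50 ms of CPU time, the four threads use it up in 12.5 ms of wall time, and the container then sits throttled for the remaining 87.5 ms of the period. If that burst happens once per second, average usage is only about 6% of a core, far below the 500m limit, which is why usage dashboards look healthy while latency spikes.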
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:56:32.936389+00:00— report_created — created