Report #5473
[gotcha] Containerized Go/Java apps experience high latency tails under low CPU utilization when CPU limits are set in Kubernetes
Remove CPU limits for latency-sensitive services (set requests only), or use the 'static' CPU manager policy with Guaranteed QoS to assign exclusive cores, or enable the CFS quota burst feature (kernel 5.15+/containerd 1.6+ with 'separate_cpu_quota' disabled). Never set limits below 1000m for single-threaded bursty workloads.
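Before changing limits, it may help to confirm that throttling is actually happening. The sketch below parses the text of a cgroup `cpu.stat` file and reports the fraction of enforcement periods in which the container was throttled. The function name `throttle_ratio` is hypothetical, and the file location is an assumption: `/sys/fs/cgroup/cpu.stat` is typical inside a container on cgroup v2, while cgroup v1 usually exposes it under the `cpu` controller directory.

```python
def throttle_ratio(cpu_stat_text: str) -> float:
    """Return nr_throttled / nr_periods from the contents of a cgroup cpu.stat file.

    Hypothetical helper for illustration. Both cgroup v1 and v2 report
    `nr_periods` and `nr_throttled` as "<key> <integer>" lines.
    """
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        # No CPU limit set (or no periods elapsed yet): nothing to report.
        return 0.0
    return stats.get("nr_throttled", 0) / periods


# Example with made-up numbers, not real measurements.
sample = "nr_periods 1000\nnr_throttled 250\nthrottled_usec 12500000\n"
print(throttle_ratio(sample))
```

A ratio well above zero on a node with spare CPU suggests the limit, not contention, is the source of the latency tail. To run it against a live container, pass in the contents of the `cpu.stat` file at whichever path applies to your cgroup version.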
Journey Context:
The Linux CFS scheduler enforces CPU quotas in 100ms windows (cfs_period_us=100000, i.e. 100ms). A 1000m limit allows 100ms of CPU time per 100ms of wall-clock time, summed across all threads in the container. A GC pause that needs 100ms of CPU once a second arrives as a single 100ms burst; if the limit is 500m (50ms per 100ms period), the process exhausts its quota after 50ms and is throttled for the remaining 50ms of the period, even if the node is idle and average utilization is low. Alternatives: 'requests' only (no limit) risks noisy neighbors but eliminates throttling; the 'static' policy pins pods to cores, removing scheduler overhead but reducing utilization. The CFS burst feature (kernel 5.15) allows accumulating unused quota.
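The quota arithmetic above can be sketched as a small simulation. This is a simplified model, not the kernel's algorithm: it assumes the burst starts exactly at a period boundary, that no quota is carried over between periods (no CFS burst), and that `threads` runnable threads consume the shared quota in parallel. The function name `burst_latency_ms` and its parameters are hypothetical.

```python
def burst_latency_ms(demand_ms, quota_ms, period_ms=100, threads=1):
    """Return (wall_clock_ms, throttled_ms) for one CPU burst under a CFS quota.

    Simplified model (assumptions): the burst starts at a period boundary,
    unused quota is not accumulated, and all threads run fully in parallel.
    """
    # CPU time usable in one period is capped by the quota and by the
    # number of threads that can actually run during the period.
    usable_ms = min(quota_ms, period_ms * threads)
    remaining = demand_ms
    wall = 0.0
    throttled = 0.0
    while remaining > 0:
        run_cpu = min(remaining, usable_ms)
        run_wall = run_cpu / threads
        remaining -= run_cpu
        if remaining > 0:
            # Quota exhausted: the cgroup sits idle until the period ends.
            throttled += period_ms - run_wall
            wall += period_ms
        else:
            wall += run_wall
    return wall, throttled


# 100ms single-threaded burst under a 500m limit (50ms quota per 100ms period)
print(burst_latency_ms(100, 50))             # (150.0, 50.0)
# Same burst under a 1000m limit: no throttling
print(burst_latency_ms(100, 100))            # (100.0, 0.0)
# Same CPU demand spread over 4 parallel threads under a 500m limit
print(burst_latency_ms(100, 50, threads=4))  # (112.5, 87.5)
```

In this model the four-thread case would finish in 25ms without a limit, so the 500m quota stretches it to 112.5ms. That is one reason multi-threaded runtimes with parallel GC can be throttled even when average utilization looks low.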
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T21:20:01.255480+00:00— report_created — created