Report #49422
[gotcha] Kubernetes HPA reports low CPU utilization while pods are CPU throttled near their limit
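Set the HPA target CPU utilization percentage relative to resource.requests, not limits. To make scaling start before throttling, either set requests equal to limits (Guaranteed QoS) or calculate the target as `desired_limit_percentage * limit / request`, which expresses the desired share of the limit as a percentage of the request (for request 100m, limit 1000m, and a desired 80% of the limit, that is 800%).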
Journey Context:
Teams often configure pods with CPU limits higher than requests (e.g., request 100m, limit 1000m) to overcommit nodes. They then set the HPA to scale at 80% CPU, expecting it to act at 800m (80% of the 1000m limit). However, HPA calculates utilization as `current_cpu / resource_request`, so an 80% target is reached at only 80m of usage, and at 800m HPA reports 800% of request. With this configuration the HPA scales far earlier than intended, which avoids throttling but adds replicas sooner than expected. The throttling failure mode appears when the target is chosen from limit intuition without doing the request-based arithmetic: if the usage implied by the target (`target_percentage / 100 * request`) exceeds the limit, usage is capped by throttling at the limit, the reported utilization stays below the target, and the HPA scales late or not at all. The solution is to always reason about HPA percentages relative to the request value, or to eliminate the ambiguity by setting request = limit.
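The arithmetic for the request 100m / limit 1000m example can be sketched in a few lines of Python. This is a minimal illustration, not Kubernetes code: the function names are hypothetical, values are in millicores, and it ignores the HPA's averaging across pods and its scaling tolerance.

```python
def hpa_utilization_percent(usage_millicores: float, request_millicores: float) -> float:
    """CPU utilization as HPA computes it: usage relative to the request."""
    return 100.0 * usage_millicores / request_millicores


def target_for_limit_fraction(limit_percent: float, request_millicores: float,
                              limit_millicores: float) -> float:
    """HPA target (percent of request) equivalent to limit_percent of the limit."""
    return limit_percent * limit_millicores / request_millicores


def target_reachable(target_percent: float, request_millicores: float,
                     limit_millicores: float) -> bool:
    """False when the usage implied by the target lies above the limit,
    so throttling caps usage before the target can be reached."""
    return target_percent / 100.0 * request_millicores <= limit_millicores


if __name__ == "__main__":
    request, limit = 100, 1000  # millicores, taken from the example above

    # Utilization HPA reports when the container uses 800m.
    print(hpa_utilization_percent(800, request))

    # Target that corresponds to 80% of the limit.
    print(target_for_limit_fraction(80, request, limit))

    # An 80% target is reached at 80m, well below the limit.
    print(target_reachable(80, request, limit))

    # A 1200% target implies 1200m, which throttling at 1000m prevents.
    print(target_reachable(1200, request, limit))
```

Running a check like `target_reachable` against each workload's request, limit, and target is one way to spot configurations where the HPA cannot trigger before throttling begins.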
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:26:20.876418+00:00— report_created — created