Report #5820
[gotcha] Kubernetes HPA scales unexpectedly because CPU utilization targets are measured against requests, not limits
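HPA CPU utilization targets (targetCPUUtilizationPercentage in autoscaling/v1, averageUtilization in autoscaling/v2) are percentages of the container's resources.requests.cpu, not resources.limits.cpu. With requests of 100m and limits of 2000m, a target of 50 means scaling begins at about 50m of average usage per pod, not at 1000m (50% of the 2000m limit). To scale near 1000m with a 100m request, the target would have to be 1000 (1000% of the request). Raising the request to reflect real usage, or using an absolute AverageValue target of 1000m in autoscaling/v2, is usually clearer.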
Journey Context:
When configuring a HorizontalPodAutoscaler, users intuitively assume the percentage target applies to the CPU limit (the hard ceiling). However, Kubernetes calculates utilization as `currentCPU / resources.requests.cpu`, averaged across the target's pods. If a container has `requests: 100m` and `limits: 2000m`, and the HPA target is set to 50%, Kubernetes scales out when average CPU usage exceeds 50m (50% of the request), not 1000m (50% of the limit). This causes premature or aggressive scaling when requests are set low for bursty workloads, and seemingly no scaling when requests are set well above actual usage, because utilization never reaches the target. The fix is to map the target percentage, mentally or in documentation, only against the request value. Removing limits where the workload's QoS class does not require them takes the misleading number out of the manifest, but it does not change how the HPA computes utilization.
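The arithmetic above can be checked with a small sketch. It assumes the replica formula given in the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), and ignores the tolerance band, stabilization windows, and min/max replica bounds. The function names and the sample usage figure of 200m are hypothetical, chosen only for illustration.

```python
import math


def cpu_utilization_percent(usage_millicores: float, request_millicores: float) -> float:
    """Utilization as the HPA sees it: usage relative to the CPU request."""
    return 100.0 * usage_millicores / request_millicores


def desired_replicas(current_replicas: int, current_util: float, target_util: float) -> int:
    """Simplified HPA formula: ceil(currentReplicas * currentMetric / desiredMetric)."""
    return math.ceil(current_replicas * current_util / target_util)


REQUEST_M = 100   # resources.requests.cpu, the value the HPA divides by
LIMIT_M = 2000    # resources.limits.cpu, not used by the HPA
TARGET = 50       # target utilization percentage

usage_m = 200     # hypothetical average usage per pod, in millicores

# What the HPA computes: 200m against a 100m request is 200% utilization.
util = cpu_utilization_percent(usage_m, REQUEST_M)

# What the limit-based intuition expects: 200m against a 2000m limit is 10%.
naive_util = cpu_utilization_percent(usage_m, LIMIT_M)

# Starting from 2 replicas, the HPA would ask for ceil(2 * 200 / 50) replicas.
replicas = desired_replicas(2, util, TARGET)

# Target needed to begin scaling at 1000m of usage with a 100m request.
target_for_1000m = cpu_utilization_percent(1000, REQUEST_M)

print(util, naive_util, replicas, target_for_1000m)
```

Under these assumptions, a pod that looks nearly idle relative to its limit (10%) is already at four times the 50% target from the HPA's point of view, which is why scale-out appears premature.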
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T22:15:14.054520+00:00— report_created — created