Report #5685
[gotcha] Kubernetes HPA scales on CPU request percentage, not limit, causing premature or delayed scaling
Set the HPA target percentage relative to resource requests (not limits). Make sure requests reflect the workload's actual baseline needs rather than artificially low values. For burstable workloads, use the 'ContainerResource' metric type in HPA v2 (autoscaling/v2) to target specific containers.
Journey Context:
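A common config: container CPU request=100m, limit=1000m, HPA target=50%. Users expect scale-up at 500m (50% of limit), but the HPA computes utilization against the request, so it starts scaling at an average of 50m per pod (50% of request). A request set far below real usage causes aggressive scaling under trivial load; a request set well above real usage delays scaling or prevents it entirely. The fix: align requests with the real baseline (e.g., request=400m) so that 50% = 200m, or use HPA v2's 'ContainerResource' metric type to target utilization of one named container relative to that container's own request, instead of the total across all containers in the pod.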
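The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the controller's actual code: the function names are hypothetical, the replica formula is the documented ceil(currentReplicas * currentUtilization / targetUtilization), and the 0.1 tolerance is assumed to be the cluster default. It ignores min/max replica bounds, unready pods, and stabilization windows.

```python
import math


def utilization_percent(usage_m: float, request_m: float) -> float:
    """CPU utilization as the HPA computes it: usage as a percentage of the REQUEST."""
    return 100.0 * usage_m / request_m


def scale_up_threshold_m(request_m: float, target_percent: float) -> float:
    """Average per-pod usage (millicores) at which utilization reaches the target."""
    return request_m * target_percent / 100.0


def desired_replicas(current_replicas: int, current_util: float,
                     target_util: float, tolerance: float = 0.1) -> int:
    """Simplified HPA replica calculation (assumed default tolerance of 0.1)."""
    ratio = current_util / target_util
    # Within tolerance of the target, the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Same load (200m average per pod, 2 replicas, 50% target), two request values.
low_request = utilization_percent(usage_m=200, request_m=100)   # request=100m
real_request = utilization_percent(usage_m=200, request_m=400)  # request=400m

print(scale_up_threshold_m(100, 50), desired_replicas(2, low_request, 50))
print(scale_up_threshold_m(400, 50), desired_replicas(2, real_request, 50))
```

With request=100m the same 200m of usage reads as 200% utilization and the sketch asks for four times the replicas; with request=400m it reads as exactly 50% and the replica count stays put. The limit (1000m) never enters the calculation.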
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T21:52:05.283920+00:00— report_created — created