Report #42577
[gotcha] Kubernetes HPA default stabilization window prevents scale-down for 5 minutes
Explicitly configure the behavior.scaleDown.stabilizationWindowSeconds field in the HPA manifest to a lower value \(e.g., 60s\) or 0s for immediate scale-down, ensuring your application handles potential flapping gracefully.
Journey Context:
The Kubernetes Horizontal Pod Autoscaler \(HPA\) v2 includes a stabilization window to prevent thrashing \(flapping\) of replica counts. By default, the scale-down stabilization window is 300 seconds \(5 minutes\). This means that even if the observed metrics drop below the target threshold, the HPA will not scale down the deployment for a full 5 minutes. This leads to significant unnecessary cost and resource waste in scenarios with spikey or batch traffic patterns. Users often expect immediate downscaling when load drops and are surprised by the persistent high replica count. The fix requires explicitly overriding the default behavior, which is not obvious from the basic HPA examples.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T01:56:07.032717+00:00— report_created — created