Report #79439
[gotcha] Kubernetes HPA causes massive over-scaling during rolling updates or pod startup
Set the readiness probe's initialDelaySeconds (or add a startupProbe) so that it covers the application's worst-case cold start time. Additionally, set the HPA's spec.behavior.scaleUp.stabilizationWindowSeconds to 60-300 seconds to dampen reactions to transient metric spikes; the scale-up default is 0, which means no stabilization.
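A minimal sketch of how the two settings above might look together, assuming a cluster recent enough to serve autoscaling/v2 and to support startupProbe. The names (example-app), image, port, /healthz path, replica bounds, CPU target, and the 120-second figures are all hypothetical placeholders, not values from this report; size them from your own measured cold start time.

```yaml
# Hypothetical Deployment: every name, path, port, and number is illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: app
          image: registry.example.com/example-app:1.0.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 500m            # CPU utilization targets need a CPU request
          startupProbe:            # readiness checks are held off until this succeeds
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 5
            failureThreshold: 24   # 5s * 24 = up to 120s allowed for cold start
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 5
---
# Hypothetical HPA with a scale-up stabilization window.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 120   # within the 60-300s range suggested above
```

The startupProbe is used here instead of a long initialDelaySeconds because it lets fast starts become ready early while still tolerating the slow ones; a fixed initialDelaySeconds on the readinessProbe is the simpler alternative if startup time is predictable.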
Journey Context:
HPA calculates desired replicas from the average metric value across *ready* pods. A pod that is not ready (for example, one that is still starting up) is excluded from the denominator, which inflates the average metric per pod. If initialDelaySeconds is shorter than the real startup time, the readiness probe runs and fails while the pod is still initializing, so the pod stays unready, HPA sees 'high' resource usage per ready pod, and it scales up further. This creates a feedback loop: new pods come up, are not yet ready, and trigger still more scaling. The common mistakes are setting initialDelaySeconds too low to 'speed up' deployment and omitting a startupProbe to shield the readiness check.
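The inflation described above can be made concrete with the replica formula from the Kubernetes HPA documentation. The numbers are hypothetical (4 replicas, a 50% CPU target, and a fixed total load equal to 4 pods at 60%), and the calculation is a simplified model of this report's description: the real controller also applies a tolerance band and treats missing metrics and not-yet-ready pods more conservatively, so actual results may be less extreme.

```latex
% Replica formula as documented for the HPA controller
\[
\text{desiredReplicas} =
  \left\lceil \text{currentReplicas} \times
  \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
\]

% Case A (hypothetical): all 4 pods ready, average utilization 60%, target 50%
\[
\left\lceil 4 \times \frac{60}{50} \right\rceil = \lceil 4.8 \rceil = 5
\]

% Case B (hypothetical): same total load, but only 2 of 4 pods are ready,
% so the average over ready pods is (4 \times 60) / 2 = 120%
\[
\left\lceil 4 \times \frac{120}{50} \right\rceil = \lceil 9.6 \rceil = 10
\]
```

Under these assumptions the same total load asks for 5 replicas when every pod is ready and 10 when half of them are not, which is the over-scaling pattern this report describes.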
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T15:56:26.934018+00:00— report_created — created