Report #7632
[gotcha] Cloud Run service with minimum instances configured still exhibits cold start latency or slow first response
Set CPU allocation to 'CPU is always allocated' instead of the default 'CPU allocated only during request processing'. This keeps the container processes actually running rather than frozen between requests.
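One way to catch this configuration before it reaches production is to inspect the service manifest. The sketch below assumes the manifest has already been parsed into a Python dict (for example from the service's exported YAML), and that minimum instances and CPU throttling appear as the `autoscaling.knative.dev/minScale` and `run.googleapis.com/cpu-throttling` annotations on the revision template. The function name `has_throttled_min_instances` is hypothetical, and treating a missing throttling annotation as "throttled" is an assumption based on the default described above.

```python
from typing import Any, Mapping

MIN_SCALE_KEY = "autoscaling.knative.dev/minScale"
CPU_THROTTLING_KEY = "run.googleapis.com/cpu-throttling"


def has_throttled_min_instances(service: Mapping[str, Any]) -> bool:
    """Return True when min instances are set but CPU is still throttled."""
    annotations = (
        service.get("spec", {})
        .get("template", {})
        .get("metadata", {})
        .get("annotations", {})
    )
    try:
        min_scale = int(annotations.get(MIN_SCALE_KEY, "0"))
    except ValueError:
        # Unparseable value: treat as "no minimum instances configured".
        min_scale = 0
    # Assumption: a missing annotation means the default (throttled) behavior.
    throttled = str(annotations.get(CPU_THROTTLING_KEY, "true")).lower() != "false"
    return min_scale > 0 and throttled
```

If the check returns True, `gcloud run services update SERVICE --no-cpu-throttling` (with `SERVICE` as a placeholder) is one way to switch the service to always-allocated CPU. Minimum instances configured at the service level rather than on the revision template would not be detected by this sketch.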
Journey Context:
By default, Cloud Run throttles CPU to near zero when no requests are being processed, even with 'Minimum instances' configured. The container exists but is essentially frozen: background threads, cache refreshes, and event loops are paused. When a new request arrives, CPU must be reallocated and the container 'thaws', which adds a 'warm start' latency (100-500ms) that is distinct from a cold start. Many developers assume 'minimum instances' means 'always warm and ready', but without 'CPU always allocated' the container merely exists; it is not running. This matters most for services with in-memory caches or background goroutines.
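To check whether background work is really being paused between requests, one option is a heartbeat thread that records the largest gap between its own ticks. This is a minimal sketch and not a Cloud Run API: `HeartbeatMonitor` and `largest_gap` are hypothetical names, the 0.1 s interval is an arbitrary choice, and the approach assumes the monotonic clock keeps advancing while the instance is throttled. Under that assumption, a gap far larger than the interval after an idle period would suggest the container was frozen.

```python
import threading
import time
from typing import Iterable


def largest_gap(timestamps: Iterable[float]) -> float:
    """Return the largest difference between consecutive timestamps."""
    ts = list(timestamps)
    return max((b - a for a, b in zip(ts, ts[1:])), default=0.0)


class HeartbeatMonitor:
    """Background thread that ticks at a fixed interval and tracks stalls."""

    def __init__(self, interval: float = 0.1) -> None:
        self.interval = interval
        self.max_gap = 0.0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def _run(self) -> None:
        last = time.monotonic()
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self.interval):
            now = time.monotonic()
            self.max_gap = max(self.max_gap, now - last)
            last = now
```

Starting the monitor at process startup and logging `max_gap` from a request handler gives a rough before-and-after comparison when switching CPU allocation modes.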
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T03:17:55.865608+00:00— report_created — created