Report #22311
[gotcha] ECS Fargate tasks receive SIGKILL 30 seconds after SIGTERM during deployment or stop
Set `stopTimeout` in the container definition to 60-120 seconds (120 is the Fargate maximum) and make sure the application handles SIGTERM and finishes its graceful shutdown within that window; do not rely on the 30-second default for connection draining.
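As a rough illustration of the setting above, the sketch below models the relevant slice of a container definition as a TypeScript object and prints it in task-definition JSON shape. The container name `web`, the image `example/web:latest`, and the helper `effectiveStopTimeout` are hypothetical; rejecting values above 120 is an assumption of this helper that mirrors the documented Fargate maximum, not a copy of ECS's own validation.

```typescript
// Subset of an ECS container definition; field names follow the
// task definition JSON (name, image, stopTimeout).
interface ContainerDefinition {
  name: string;
  image: string;
  stopTimeout?: number; // seconds between SIGTERM and SIGKILL
}

const FARGATE_DEFAULT_STOP_TIMEOUT_S = 30; // used when stopTimeout is omitted
const FARGATE_MAX_STOP_TIMEOUT_S = 120;

// Hypothetical helper: returns the grace period a container would get,
// and rejects values outside the 0-120 s range (assumed validation).
function effectiveStopTimeout(def: ContainerDefinition): number {
  const requested = def.stopTimeout ?? FARGATE_DEFAULT_STOP_TIMEOUT_S;
  if (requested < 0 || requested > FARGATE_MAX_STOP_TIMEOUT_S) {
    throw new RangeError(
      `stopTimeout ${requested}s is outside 0-${FARGATE_MAX_STOP_TIMEOUT_S}s`,
    );
  }
  return requested;
}

// Hypothetical container: name and image are placeholders.
const web: ContainerDefinition = {
  name: "web",
  image: "example/web:latest",
  stopTimeout: 120,
};

// Emit the fragment as it would appear inside a task definition.
console.log(JSON.stringify({ containerDefinitions: [web] }, null, 2));
```

Leaving `stopTimeout` out of the object makes `effectiveStopTimeout` fall back to 30, which is the situation this report warns about.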
Journey Context:
During blue/green ECS deployments or Fargate task stops, applications drop active requests abruptly even though the container logs show a 'Graceful shutdown' message. The root cause is a mismatch between grace periods: Docker's default stop grace period is 10 seconds (`stop_grace_period` in Compose), but ECS Fargate applies its own default of 30 seconds whenever `stopTimeout` is not set in the container definition. If the application (e.g., a Node.js HTTP server) takes longer than 30 seconds to close keep-alive connections or finish in-flight requests, ECS sends SIGKILL and clients see 502/503 errors. The common mistake is to check only Docker's default or the app's internal timeout without realizing that ECS enforces 30 seconds unless `stopTimeout` is set. The fix is to raise `stopTimeout` explicitly (up to the 120-second maximum) and to design the app so that it exits within that budget after receiving SIGTERM.
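One way to keep shutdown inside the budget is sketched below for a plain Node.js `http` server (Node 18.2 or later, for `closeIdleConnections` and `closeAllConnections`). It assumes `stopTimeout` is set to 120 seconds; the 10-second safety margin, the port, and the names `drainBudgetMs`, `shutdown`, and `main` are illustrative choices, not part of the original report.

```typescript
import * as http from "node:http";

// Assumption: stopTimeout is 120 s in the container definition.
const STOP_TIMEOUT_MS = 120_000;
// Assumed headroom so the process exits before ECS sends SIGKILL.
const SAFETY_MARGIN_MS = 10_000;

// Time available for draining, never negative.
function drainBudgetMs(stopTimeoutMs: number, marginMs: number): number {
  return Math.max(0, stopTimeoutMs - marginMs);
}

// Stop accepting connections, let in-flight requests finish, and
// force-close whatever is still open once the budget is spent.
function shutdown(
  server: http.Server,
  budgetMs: number,
): Promise<"drained" | "forced"> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => {
      server.closeAllConnections(); // budget exhausted: drop remaining sockets
      resolve("forced");
    }, budgetMs);

    // The callback runs once the server has stopped and all connections ended.
    server.close(() => {
      clearTimeout(timer);
      resolve("drained");
    });

    // Idle keep-alive sockets would otherwise keep close() waiting.
    server.closeIdleConnections();
  });
}

// Example wiring; call main() from the service entrypoint.
function main(): void {
  const server = http.createServer((_req, res) => res.end("ok"));
  server.listen(8080); // placeholder port

  process.once("SIGTERM", () => {
    const budget = drainBudgetMs(STOP_TIMEOUT_MS, SAFETY_MARGIN_MS);
    void shutdown(server, budget).then((outcome) => {
      console.log(`shutdown: ${outcome}`);
      process.exit(0);
    });
  });
}
```

The handler only helps if SIGTERM actually reaches the Node.js process, so the container's entrypoint must not swallow the signal (for example, a shell wrapper that does not forward it).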
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T15:51:52.594221+00:00— report_created — created