Report #60828
[bug\_fix] Production deployment cancelled mid-flight when a subsequent commit is pushed to main, due to an overly broad concurrency group
Use distinct concurrency groups for production \(e.g., group: deploy-prod\) with cancel-in-progress: false, while keeping cancel-in-progress: true for staging/PR environments
Journey Context:
A development team configures a unified deployment workflow that handles both staging previews \(on pull requests\) and production releases \(on pushes to main\). To prevent resource conflicts and unnecessary queueing, they add a concurrency block: concurrency: group: deploy-$\{\{ github.head\_ref \|\| github.ref \}\} cancel-in-progress: true. This works well for PRs: when a developer pushes new commits to a PR branch, the stale deployment for that branch is cancelled immediately.

However, one day two developers merge PRs to main within minutes of each other. The first merge triggers a production deployment that begins running database migrations and rolling out containers. While it is partway through \(minute 6 of 10\), the second merge triggers the same workflow. On push events github.head\_ref is empty, so the group name falls back to github.ref, which is refs/heads/main for both runs. They therefore share the concurrency group deploy-refs/heads/main, and cancel-in-progress: true causes GitHub Actions to terminate the first production deployment immediately. The production environment is left in a half-migrated, inconsistent state, causing an outage.

The team investigates the "Canceling since a higher priority waiting request exists" message in the logs and realizes that the same concurrency logic that prevents PR queueing is dangerous for production continuity. They refactor the workflow to use conditional concurrency. For production \(if: github.ref == 'refs/heads/main'\), they use concurrency: group: production-deploy cancel-in-progress: false, so a running production deployment is never cancelled by a newer one; the newer run waits as pending until the running one finishes. Note that GitHub Actions keeps at most one pending run per concurrency group, so if a third run is queued while one is already waiting, the older pending run is cancelled in favour of the newest. For staging/PRs, they keep the aggressive cancellation with unique branch-based groups.
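The refactor described above could look roughly like the following workflow. This is a minimal sketch, not the team's actual file: the workflow name, job names, runner label, checkout version, and the ./scripts/deploy.sh script are all assumptions made for illustration. Only the concurrency group names, the cancel-in-progress values, and the if condition for production come from the report.

```yaml
# Hypothetical workflow; names and the deploy script are illustrative assumptions.
name: deploy

on:
  push:
    branches: [main]
  pull_request:

jobs:
  deploy-staging:
    # Runs only for pull requests.
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    concurrency:
      # One group per PR branch: a new push cancels the stale preview deploy.
      group: deploy-${{ github.head_ref || github.ref }}
      cancel-in-progress: true
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/deploy.sh staging      # assumed script

  deploy-production:
    # Runs only for main, as in the report.
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    concurrency:
      # Fixed group name: all production deploys serialize on it.
      group: production-deploy
      # A running deploy is never interrupted; a newer run waits as pending.
      cancel-in-progress: false
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/deploy.sh production   # assumed script
```

Placing the concurrency block at job level rather than workflow level lets each job carry its own policy. An alternative is a single workflow-level block whose group and cancel-in-progress values are expressions on github.ref; either way, it is worth confirming the behaviour in a test repository before relying on it for production.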
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T08:35:03.263480+00:00— report_created — created