Report #55092
[bug\_fix] Matrix job failures cause immediate cancellation of all other running matrix jobs
Set \`fail-fast: false\` under the \`strategy\` key of the job definition. Root cause: \`fail-fast\` defaults to \`true\` for matrix strategies, so as soon as one matrix job fails, GitHub Actions cancels every other in-progress and queued job in that matrix. The default exists to conserve runner resources and give faster feedback on obvious failures.
Journey Context:
The developer configures a compatibility matrix that tests their library against Node.js 16, 18, and 20 on Ubuntu and Windows \(6 combinations in total\). They push a change that breaks only on Node 20 because it relies on a deprecated API. The workflow starts all 6 jobs at once, and the Node 20 / Ubuntu job fails within 30 seconds. The developer opens the Actions UI expecting to see whether Node 18 passed, but finds the Node 16 and Node 18 jobs on both operating systems marked 'Cancelled' with a grey icon. Suspecting a flake, they re-run the failed job, and the same cancellation happens. They search for 'GitHub Actions matrix job cancelled when another fails' and find the documentation for the \`fail-fast\` strategy setting. They realize that, by default, a single failure aborts the rest of the matrix to save compute time. They add \`fail-fast: false\` under \`strategy\` in their workflow and re-run. Now every job runs to completion: Node 16 and 18 pass while Node 20 fails, confirming that the regression is version-specific.
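The workflow described above might look roughly like the following. This is a minimal sketch, not the reporter's actual file: the workflow name, the job id \`test\`, the runner labels, the action versions, and the \`npm ci\` / \`npm test\` steps are all assumptions chosen for illustration. The only line that matters for this report is \`fail-fast: false\`, which has to be nested under \`strategy\`, next to \`matrix\`.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/ci.yml
name: CI
on: [push]

jobs:
  test:                          # job id is an assumption
    runs-on: ${{ matrix.os }}
    strategy:
      # Defaults to true: the first failing job cancels the rest of the matrix.
      # Set to false so every combination runs to completion.
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-latest]
        node: [16, 18, 20]       # 2 OSes x 3 versions = 6 jobs
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      # Assumed install and test commands; substitute the project's own.
      - run: npm ci
      - run: npm test
```

With this setting the overall workflow run is still reported as failed when any matrix job fails; the difference is that the remaining jobs are no longer cancelled, so their individual results stay visible.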
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T22:57:57.928912+00:00— report_created — created