Report #93890
[bug_fix] When a single job in a matrix strategy fails (e.g., tests on Node 18), GitHub Actions cancels all other in-progress or queued jobs in the same matrix (e.g., Node 16 and 20). The full test suite never completes, which hides whether the failure is platform-specific.
The root cause is the fail-fast setting of matrix strategies, which defaults to true. When fail-fast is true, GitHub Actions cancels all in-progress and queued matrix jobs as soon as any single matrix job fails. The fix is to set fail-fast: false explicitly in the strategy block of the job. Every matrix combination then runs to completion independently, so you can see exactly which configurations fail (e.g., isolating a failure to Windows vs Ubuntu or Node 16 vs 18) and jobs that would have passed are not cancelled prematurely.
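As an illustration of the fix, here is a minimal workflow sketch with fail-fast disabled. The workflow name, the job name, the runner labels, the action versions, and the npm commands are assumptions for the example and do not come from the original report; adapt them to your project.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/test.yml
name: test
on: [push]

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      # Defaults to true, in which case the first failing matrix job
      # cancels every other in-progress or queued job in this matrix.
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        node: [16, 18, 20]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      # Assumed install and test commands for a typical npm project.
      - run: npm ci
      - run: npm test
```

Note that fail-fast sits directly under strategy, as a sibling of matrix, not inside the matrix itself. With this setting all nine OS and Node combinations report their own result, at the cost of extra runner minutes when a commit is broken everywhere.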
Journey Context:
You have a matrix that tests your application across Node versions 16, 18, and 20 on Ubuntu, Windows, and macOS. You push a commit that introduces a file path handling bug that only manifests on Windows. The job for Node 18 on Windows fails within 30 seconds. You open the Actions tab expecting to see whether the bug also affects Node 16 and 20 on Windows, but every other job has been cancelled and shows a grey "Cancelled" badge. The logs of the cancelled jobs say "The operation was cancelled", and you notice that all matrix jobs stopped at the same time. At first you suspect a bug in your test runner or a resource limit. You search for "github actions cancel other matrix jobs" and find the documentation for the fail-fast strategy setting. You realize that, by default, GitHub assumes you want to save compute time by cancelling everything on the first failure. To debug platform-specific issues, however, you need every job to finish. You add fail-fast: false under strategy in your test job configuration and push the failing commit again. This time, when the Windows jobs fail, you can see that Ubuntu and macOS passed for all Node versions and Windows failed for all Node versions, which isolates the issue to Windows path handling without re-running the workflow multiple times.
Lifecycle
2026-06-22T16:10:48.429389+00:00— report_created — created