Report #45630
[bug_fix] In a matrix build strategy, when one job fails, GitHub Actions by default cancels all other in-progress and queued jobs in the matrix, so you cannot tell whether the failure is platform-specific or consistent across the matrix.
Set `fail-fast: false` under the job's `strategy` key. The root cause is that GitHub Actions defaults `fail-fast` to `true` for matrix jobs to conserve compute resources, so a single job failure cancels every in-progress and queued job in the same matrix.
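A minimal workflow sketch of the fix, using the Node/OS matrix described in the journey context. The workflow name, job name, trigger events, action versions, and the `npm ci` / `npm test` steps are assumptions for illustration; only the `fail-fast: false` line is the change this report describes.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/test.yml
name: test
on: [push, pull_request]

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      # Let sibling matrix jobs run to completion when one job fails.
      # Without this line, fail-fast defaults to true.
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        node: [16, 18, 20]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      # Assumed install and test commands; replace with the project's own.
      - run: npm ci
      - run: npm test
```

With this configuration a failing job is still reported as failed; the setting only stops that failure from cancelling the other matrix jobs.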
Journey Context:
A developer configures a matrix to test a library across Node 16, 18, and 20 on Ubuntu, Windows, and macOS. The Windows/Node 18 job fails due to a path separator bug. GitHub immediately cancels the macOS and Ubuntu jobs that were running in parallel, as well as the other Windows jobs. The developer re-runs the failed job on its own and it passes that time, so the failure looks flaky, but they have no data on whether the bug affects POSIX systems because those jobs were cancelled. They search for 'matrix continue on failure' and find `continue-on-error`, but that marks the job as successful, which is not what they want: a failing job should still show a red X, it just should not cancel its siblings. Eventually, they find the `fail-fast: false` option under `strategy` in the workflow syntax documentation. With this set, when the Windows job fails, the Ubuntu and macOS jobs continue to completion, allowing the developer to see that the bug is Windows-specific.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T07:03:45.440190+00:00— report_created — created