Report #77802
[bug_fix] Matrix strategy cancels all jobs when a single job fails
Set `fail-fast: false` under `strategy` on the matrix job. Root cause: `fail-fast` defaults to `true` for matrix strategies. When it is `true`, GitHub Actions cancels all in-progress and queued jobs in the matrix as soon as any single matrix job fails. The default is intended to conserve runner resources and give fast feedback on obvious failures.
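As a minimal sketch of where the setting goes, the workflow below nests `fail-fast` under `strategy`, next to `matrix`, for a 3x3 matrix like the one in this report (Node.js 16, 18, and 20 on Ubuntu, macOS, and Windows). The workflow name, job name, runner labels, action versions, and the `npm` commands are assumptions for illustration, not taken from the reporter's repository.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/test.yml
name: test
on: [push]

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      # Default is true; false lets the remaining matrix jobs
      # run to completion after one of them fails.
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        node: [16, 18, 20]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      # Assumed install and test commands; substitute the project's own.
      - run: npm ci
      - run: npm test
```

With this in place, a failure in one matrix combination should leave the other jobs running, at the cost of more runner minutes on commits that fail everywhere.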
Journey Context:
A developer creates a comprehensive test matrix across Node.js 16, 18, and 20 and across Ubuntu, macOS, and Windows (9 total jobs). They push a commit that accidentally breaks only on Node 20. The workflow starts all 9 jobs, and the Node 20 / Ubuntu job fails within 30 seconds. The developer expects the other 8 jobs to continue so they can see whether the failure is specific to Node 20 or affects all versions. Instead, when they refresh the UI, every other job shows a grey "Cancelled" icon with the text "The operation was canceled." They initially suspect a runner outage or a timeout. They check the raw logs and see "Error: The operation was canceled" at the end, but no error in the job steps themselves. They search online and find a StackOverflow post mentioning that GitHub Actions cancels matrix jobs by default on failure. They look at the workflow documentation for `strategy` and find the `fail-fast` option. They add `fail-fast: false` under `strategy` in their job definition. On the next push, when the Node 20 job fails, the Node 16 and 18 jobs continue to completion, showing that the bug is indeed isolated to Node 20.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T13:11:41.186978+00:00— report_created — created