Report #97801
[bug\_fix] One failing matrix job immediately cancels every other variant in the matrix, or the matrix generates unsupported or invalid combinations \(e.g. a Node version on an OS where it is no longer available\). Either way the workflow fails unexpectedly.
Set \`strategy.fail-fast: false\` when you want every matrix variant to run to completion regardless of sibling failures. Use \`matrix.include\` to add specific combinations and \`matrix.exclude\` to remove unsupported ones. Check that every runner label in the matrix still exists \(a retired label leaves its jobs waiting for a runner\) and that moving labels such as \`ubuntu-latest\`, \`macos-latest\`, and \`windows-latest\` actually provide the environment your matrix expects.
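The settings above could be combined as in the following minimal sketch. It assumes a Node.js project tested with npm; the workflow name, job name, action versions, and the specific excluded and included combinations are illustrative assumptions, not taken from the affected repositories.

```yaml
name: test-matrix  # hypothetical workflow name
on: [push, pull_request]

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      # Let every variant finish even if a sibling fails (the default is true).
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        node: [18, 20]
        # Illustrative only: remove one combination from the 3 x 2 product.
        exclude:
          - os: macos-latest
            node: 18
        # Illustrative only: add one combination that is outside the product.
        include:
          - os: ubuntu-latest
            node: 22
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      - run: npm ci
      - run: npm test
```

With these example values the matrix starts as six combinations, the exclude entry removes one, and the include entry adds one back, so six jobs run in total. Exclusions are applied before inclusions, so an include entry can reintroduce a combination that exclude removed.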
Journey Context:
I added a matrix to test Node 18 and 20 on ubuntu-latest, macos-latest, and windows-latest, which gives six variants. The Windows/Node 18 variant failed because of a path-separator bug, and to my surprise GitHub cancelled the remaining five variants while they were still running. I needed to know whether the failure was Windows-specific or Node-specific, but \`fail-fast\` \(which defaults to \`true\`\) had hidden that information. I added \`strategy.fail-fast: false\` and re-ran; this time only the genuinely broken variant failed and I could see the pattern. In another repo, the matrix still included \`ubuntu-18.04\` long after GitHub retired that runner label, so those jobs sat at 'Waiting for a runner...' until they timed out; I used \`exclude\` to drop retired labels and switched to \`ubuntu-latest\`. Both problems come down to two facts about matrices: \`fail-fast\` defaults to \`true\`, and the combinations are the Cartesian product of all axes unless you constrain them with \`include\` and \`exclude\`.
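The retired-label case can be sketched as the strategy fragment below. The exact axes of that second repo are not given in this report, so the os and node lists here are assumptions chosen to mirror the first matrix.

```yaml
strategy:
  fail-fast: false
  matrix:
    # ubuntu-18.04 is a retired label: jobs that target it never get a runner.
    os: [ubuntu-18.04, ubuntu-latest, macos-latest, windows-latest]
    node: [18, 20]
    # An exclude entry only needs to match partially, so naming the os alone
    # removes every Node version on the retired label.
    exclude:
      - os: ubuntu-18.04
```

Here the four-by-two product yields eight combinations and the exclude entry removes two, leaving the same six jobs as a matrix that never listed the retired label. Deleting the label from the os list has the same effect and is usually the simpler long-term fix.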
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-26T04:43:55.536287+00:00— report_created — created