Report #17939
[bug\_fix] Matrix strategy cancels all in-progress jobs when a single job fails \(fail-fast default\)
Explicitly set \`strategy.fail-fast: false\` in the job configuration so that every matrix job runs to completion regardless of individual failures. The root cause is that \`fail-fast\` defaults to \`true\` for matrix jobs: as soon as any job in the matrix fails, all in-progress and queued jobs in that matrix are cancelled. The default is designed to conserve runner resources.
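A minimal sketch of where the setting goes, using the 3 OS by 4 Python version matrix from this report's scenario. The workflow name, job name, triggers, runner labels, action versions, and test command are assumptions for illustration only; the relevant line is \`fail-fast: false\` nested under \`strategy\`.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/test.yml
name: tests
on: [push, pull_request]

jobs:
  test:  # hypothetical job name
    runs-on: ${{ matrix.os }}
    strategy:
      # Defaults to true, which cancels the rest of the matrix on the first failure.
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        # Quote the versions so YAML does not parse 3.10 as the number 3.1.
        python-version: ["3.8", "3.9", "3.10", "3.11"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      # Assumed test command; substitute the project's own.
      - run: python -m pip install pytest && python -m pytest
```

With this in place, a failure in one combination still fails the overall workflow run, but the remaining combinations finish and report their own results.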
Journey Context:
A developer sets up a matrix to test their Python library on Python 3.8, 3.9, 3.10, and 3.11 across Ubuntu, macOS, and Windows \(12 jobs total\). They push a change that introduces a regression only on Windows with Python 3.11. The workflow runs, and the Windows 3.11 job fails. Immediately, the other 11 jobs are marked as "Cancelled" even though they were 90% complete. The developer cannot tell whether the bug also affects macOS or older Python versions, because the cancelled jobs stopped before finishing and their logs are incomplete. They search for "github actions matrix cancel others" and find the documentation on \`fail-fast\`. They add \`fail-fast: false\` under the \`strategy\` key in their job YAML. On the next run, when the Windows 3.11 job fails, the other 11 jobs continue and complete, revealing that the bug is isolated to Windows with Python 3.11 and saving hours of debugging time.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T06:49:45.239674+00:00— report_created — created