Report #14768
[bug_fix] With the default matrix strategy, a single failing matrix job causes all other in-progress and queued jobs in the matrix to be cancelled, so the run never shows which specific environment configurations are broken.
Set `fail-fast: false` under the job's `strategy` key (`jobs.<job_id>.strategy.fail-fast`). This boolean defaults to `true`, which cancels all in-progress and queued jobs in the matrix as soon as any matrix job fails. Setting it to `false` lets every matrix job run to completion regardless of individual failures, so a single run yields the complete set of pass/fail results.
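A minimal sketch of where the setting sits in a workflow file, assuming the 4 × 3 matrix described in the journey below. The workflow name, job id `test`, runner labels, action versions, and the install and test commands are illustrative assumptions, not taken from the reporter's repository.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/tests.yml
name: tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      # Default is true: one failure cancels the rest of the matrix.
      # false lets all 12 combinations run to completion.
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        # Quote versions so YAML does not parse 3.10 as the number 3.1.
        python-version: ["3.9", "3.10", "3.11", "3.12"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      # Assumed install and test commands; adjust to the project.
      - run: python -m pip install -e . pytest
      - run: python -m pytest
```

Note that `fail-fast` is a sibling of `matrix` under `strategy`, not an entry inside `matrix`. The workflow run as a whole is still reported as failed if any matrix job fails; only the cancellation of sibling jobs is disabled.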
Journey Context:
A developer maintains a Python library that supports Python 3.9, 3.10, 3.11, and 3.12, and runs tests on Ubuntu, Windows, and macOS using a matrix strategy with 12 combinations (4 Python versions × 3 operating systems). One day, a commit introduces a platform-specific bug that only affects Windows with Python 3.12. The workflow runs, and the Ubuntu and macOS jobs start alongside the Windows jobs. Then the Windows-py312 job fails. Watching the UI, the developer sees that as soon as that job fails, all other running jobs (Ubuntu-py39, macos-py310, etc.) turn grey with 'Cancelled' status, even though they were passing and had nearly completed. The developer sees only the Windows error and cannot tell whether any other configuration is also affected, or whether a Windows fix would break Ubuntu or macOS, without running another full workflow. Frustrated by the incomplete feedback loop and the runner minutes wasted on cancelled jobs, the developer searches 'github actions matrix job cancelled when one fails' and finds documentation on the `fail-fast` strategy option. They add `fail-fast: false` under the `strategy` key of their test job definition. On the next push, the Windows-py312 job still fails, but the other 11 jobs, from Ubuntu-py39 through macos-py312, run to completion and report green checkmarks. The developer can now see the full matrix results in one run, confirm that the failure is isolated to Windows-py312, and fix it confidently without triggering multiple re-runs.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T22:21:37.827902+00:00— report_created — created