Report #15512
[bug_fix] Matrix job failures cause immediate cancellation of all other matrix jobs, making it hard to identify environment-specific failures
Set `fail-fast: false` in the `strategy` block of the job. This allows all matrix combinations to run to completion regardless of whether other combinations fail. Root cause: `fail-fast` defaults to `true` for matrix jobs, so GitHub cancels all in-progress and queued jobs in the matrix as soon as any one of them fails. That conserves runner resources, but it impedes debugging when testing compatibility across multiple environments.
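A minimal sketch of where the setting goes, using the matrix described in the Journey Context below. The workflow name, job name, trigger, action versions, and test command are assumptions for illustration, not taken from the reporter's workflow:

```yaml
# Hypothetical workflow file, e.g. .github/workflows/test.yml
name: tests
on: [push]

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      # Let every matrix combination finish even if one of them fails.
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-latest]
        # Quoted so that YAML does not parse 3.10 as the number 3.1.
        python-version: ["3.8", "3.9", "3.10", "3.11"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      # Assumed test command; replace with the project's own.
      - run: python -m pytest
```

With `fail-fast: false`, the workflow run is still reported as failed if any combination fails; the only difference is that the remaining combinations are no longer cancelled.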
Journey Context:
A developer maintains a library and uses a matrix strategy to test against Python 3.8, 3.9, 3.10, and 3.11 on both Ubuntu and Windows. A bug is reported that might be Windows-specific. The developer pushes a test branch and the workflow starts. The Windows + Python 3.10 job fails immediately due to the bug. The developer receives a failure email and opens the Actions UI, but sees that the Windows 3.8, 3.9, and 3.11 jobs and all Ubuntu jobs are greyed out and marked as "Cancelled". They cannot tell whether the bug affects only 3.10 or all Windows versions. They search "github actions matrix continue on error" and find the `fail-fast: false` option. They add it to their workflow YAML and push the change, and now all matrix combinations run to completion. The results show that only Windows + Python 3.10 fails while the other Windows versions pass, which isolates the bug to a 3.10-specific API change on Windows.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T00:19:19.480104+00:00— report_created — created