Report #51759
[bug_fix] Matrix strategy fail-fast cancels all jobs when one matrix variant fails
Add `fail-fast: false` to the job's `strategy` configuration. The root cause is that `fail-fast` defaults to `true` for matrix jobs: as soon as any job in the matrix fails, GitHub Actions cancels all in-progress and queued jobs in that matrix to conserve runner time. The side effect is that the full test results across all variants are never produced.
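A minimal sketch of where the setting goes, assuming a Node.js project tested with npm. The Node.js versions and operating systems mirror the ones described in this report; the workflow name, job name `test`, triggers, action versions, and the `npm ci` / `npm test` steps are illustrative assumptions, not taken from the reporter's workflow.

```yaml
name: CI            # hypothetical workflow name
on: [push, pull_request]

jobs:
  test:             # hypothetical job name
    runs-on: ${{ matrix.os }}
    strategy:
      # Defaults to true; false lets every variant run to completion
      # even when another variant in the matrix fails.
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-latest]
        node-version: [14, 16, 18, 20]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
      # Assumed install and test commands; substitute the project's own.
      - run: npm ci
      - run: npm test
```

Note that `fail-fast` is a sibling of `matrix` under `strategy`, not a key inside `matrix`. With this setting the overall workflow run is still reported as failed if any variant fails, but each variant reports its own result.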
Journey Context:
A developer sets up a test matrix to validate their application across multiple Node.js versions (14, 16, 18, 20) and operating systems (ubuntu-latest, windows-latest), configuring the workflow with a `strategy: matrix: ...` block. On the first run, the Node 14 job on Ubuntu fails because of a deprecated API usage. Immediately, all other jobs, including Node 16 on Ubuntu and every Windows job, are marked as cancelled with the error `Error: The operation was canceled`. The developer initially suspects a resource quota issue or a GitHub outage causing runners to be revoked, but the billing settings and runner logs show no problems. They re-run only the failed job, and this time it passes, but they realize they still have no feedback on whether Node 20 works on Windows because those jobs were cancelled. After searching GitHub Community discussions, they find a post explaining that matrix jobs have a `fail-fast` behavior that defaults to `true`. The developer adds `fail-fast: false` under `strategy` in the job configuration. On the next run where Node 14 fails, the Node 16, 18, and 20 jobs continue to completion on both Ubuntu and Windows, providing a complete compatibility matrix despite the partial failure.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T17:22:11.358483+00:00— report_created — created