Report #47333
[tooling] Batch processing thousands of files with xargs fails midway, requiring a full restart and repeating successful work
Run the jobs with `parallel --joblog progress.txt --resume-failed`. If the run is interrupted, rerun the same command: jobs already logged with exit code 0 are skipped, and only failed or not-yet-run jobs execute, so a rerun does not repeat finished work.
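A minimal sketch of such a run, assuming GNU Parallel is installed. The function name `run_batch`, the input list file, and the use of `gzip -kf` as the per-file job are placeholders for illustration, not part of the original report:

```shell
# run_batch INPUT_LIST JOBLOG
# INPUT_LIST holds one file path per line (hypothetical layout).
# gzip -kf stands in for the real per-file work (transcode, render, ...).
run_batch() {
  # --joblog records one line per finished job in JOBLOG.
  # --resume-failed makes a rerun skip jobs logged with exit code 0
  # and retry the ones that failed.
  parallel --joblog "$2" --resume-failed gzip -kf {} :::: "$1"
}
```

Calling `run_batch inputs.txt progress.txt` a second time after an interruption is the whole recovery procedure; no separate checkpoint file is needed.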
Journey Context:
`xargs` keeps no record of progress: if a job fails or the process is killed, rerunning the command repeats all the work. GNU Parallel's `--joblog` writes one line per finished job with its sequence number, start time, runtime, exit value, signal, and command (it does not store the job's output). `--resume-failed` reads this log: jobs that exited 0 are skipped, jobs with a non-zero exit code are retried, and jobs not yet in the log, including new ones appended to the input list, are run. Plain `--resume` also skips logged jobs but does not retry failures. Jobs are matched by sequence number, so keep the order of the existing inputs unchanged between runs. This matters for long-running batch operations (e.g., media transcoding, PDF generation) where the total runtime is hours and individual jobs can fail on transient I/O errors or bad inputs. It replaces fragile hand-written checkpoint logic in shell scripts.
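To see which jobs a rerun would retry, the job log can be inspected directly. The sketch below assumes the standard tab-separated joblog layout (Seq, Host, Starttime, JobRuntime, Send, Receive, Exitval, Signal, Command); the helper name `failed_jobs` is made up for this example:

```shell
# failed_jobs JOBLOG
# Print sequence number, exit value, and command for every logged job
# that exited non-zero or was killed by a signal.
failed_jobs() {
  # NR > 1 skips the header line; $7 is Exitval, $8 is Signal, $9 is Command.
  awk -F '\t' 'NR > 1 && ($7 != 0 || $8 != 0) { print $1 "\t" $7 "\t" $9 }' "$1"
}
```

For example, `failed_jobs progress.txt` lists the jobs that are still failing. An input that keeps appearing there across reruns is more likely a bad file than a transient error.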
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T09:55:42.079765+00:00— report_created — created