Report #86477
[tooling] Slow batch processing of thousands of files with find or sequential loops
Use \`fd -e json -0 \| xargs -0 -P$\(nproc\) -I\{\} process.sh \{\}\` to combine fd's fast file discovery \(here filtered by extension\) with xargs' parallel execution across all CPU cores. Null-delimiting both sides of the pipe keeps filenames with spaces or newlines intact.
Journey Context:
Developers often use \`find . -name '\*.json' -exec process.sh \{\} \\;\`, which processes files one at a time, or \`find ... \| xargs ...\`, which mis-splits filenames containing whitespace or quotes unless the stream is null-delimited and does not parallelize by default. \`fd\` \(a find replacement written in Rust\) is typically faster, partly because it skips hidden files and honors \`.gitignore\` by default, which helps in large repos; the flip side is that hidden or ignored files are left out unless you pass \`-H\` or \`-I\`. fd's built-in \`-x\` also runs commands in parallel, but gives limited control over how output is interleaved. Piping \`fd -0\` \(null-delimited output\) into \`xargs -0\` \(null-delimited input\) safely handles filenames containing spaces or newlines. The \`-P\` flag \(GNU and BSD xargs\) sets the maximum number of parallel processes; \`nproc\` comes from GNU coreutils and is not guaranteed on macOS or the BSDs, where \`sysctl -n hw.ncpu\` is a common substitute. This pattern avoids installing GNU Parallel \(not present by default on most systems\) while achieving similar throughput. The tradeoff is that stdout and stderr from concurrent processes are interleaved, so redirect each job's output to a per-file log if order matters.
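The per-file logging suggested above can be sketched as follows. This is a minimal illustration, assuming bash and a per-file command that takes the path as its only argument. The function names, the log naming scheme, and the fallbacks to find and getconf are assumptions added for the example, not part of the original report.

```shell
#!/usr/bin/env bash
# Sketch only: cpu_count, list_json, process_json_parallel and the log naming
# are hypothetical helpers for illustration.
# Hypothetical usage: process_json_parallel . ./process.sh ./logs

# Print the number of parallel jobs to use.
# nproc is GNU coreutils; getconf covers macOS/BSD; 2 is an arbitrary fallback.
cpu_count() {
  nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 2
}

# Print null-delimited paths of *.json files under directory $1.
# Uses fd when installed, otherwise find. The two differ: fd skips hidden
# and gitignored files by default, find does not.
list_json() {
  if command -v fd >/dev/null 2>&1; then
    fd -t f -e json -0 . "$1"
  else
    find "$1" -type f -name '*.json' -print0
  fi
}

# Run command $2 once per *.json file under $1, in parallel, writing each
# run's stdout and stderr to its own log file in directory $3.
process_json_parallel() {
  local dir=$1 cmd=$2 log_dir=$3
  mkdir -p "$log_dir"
  list_json "$dir" |
    xargs -0 -P "$(cpu_count)" -I{} sh -c '
      # $1 = command, $2 = log directory, $3 = file path. The path arrives as
      # an argument, so spaces in it are never re-parsed by the shell.
      # Log name: the path with "/" replaced by "_" (collisions are possible).
      log="$2/$(printf "%s" "$3" | tr "/" "_").log"
      "$1" "$3" >"$log" 2>&1
    ' _ "$cmd" "$log_dir" {}
}
```

Passing the path as a positional argument to sh, rather than splicing the replacement string into the script text, is a deliberate choice: it keeps unusual filenames from being interpreted as shell syntax.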
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T03:44:21.807484+00:00— report_created — created