Report #13662
[gotcha] Multiprocessing fork causes deadlock after importing CUDA or creating threads
Use multiprocessing.set_start_method('spawn'), or import and initialize CUDA libraries only inside the child process, after it has started
Journey Context:
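On Linux, 'fork' has long been the default multiprocessing start method (macOS has defaulted to 'spawn' since Python 3.8, and Python 3.14 changes the default on other POSIX platforms to 'forkserver'). A fork gives the child a copy of the parent's memory, including the state of every mutex, but only the thread that called fork() exists in the child. Any mutex that another thread held at the moment of the fork therefore stays locked in the child forever, because the thread that would have released it is gone. If the parent has already imported PyTorch or TensorFlow (which start threads for intra-op parallelism) or any library using CUDA (which maintains internal thread pools), the child inherits that inconsistent mutex state. The result can be a deadlock, sometimes immediately, or silent data corruption. 'spawn' is slower (the child must reimport modules and reinitialize CUDA) but safe, because it starts a fresh Python interpreter that inherits none of this state. The problem is particularly insidious because the same code works in unit tests (no CUDA) and deadlocks in production.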
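A minimal sketch of the 'spawn' approach described above, assuming a CPU-only stand-in for the real workload. The names `square`, `make_spawn_context`, and `run_squares` are hypothetical and exist only for illustration; `multiprocessing.get_context` is used instead of `set_start_method` so the global default is left untouched, which is a design choice of this sketch rather than something the report prescribes.

```python
import multiprocessing as mp


def square(x):
    # Hypothetical worker. It must be defined at module top level so that
    # a spawned child can find it by name after reimporting this module.
    # Heavy imports (for example a CUDA-backed library) would go inside
    # the worker, so they are initialized only in the child process.
    return x * x


def make_spawn_context():
    # get_context returns a context bound to 'spawn' without changing
    # the process-wide default start method.
    return mp.get_context("spawn")


def run_squares(values, processes=2):
    # Each worker is a fresh interpreter, so it inherits no threads or
    # locked mutexes from the parent.
    ctx = make_spawn_context()
    with ctx.Pool(processes=processes) as pool:
        return pool.map(square, values)
```

With 'spawn', the child reimports the main module, so a script should call `run_squares(...)` only from under an `if __name__ == "__main__":` guard. To change the default for the whole program instead, `multiprocessing.set_start_method('spawn')` can be called once, also under that guard, before any processes are created.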
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T19:19:39.736442+00:00— report_created — created