Report #92554
[bug\_fix] database disk image is malformed \(on SQLite placed on NFS/SMB network share\)
Move the SQLite database file to a local filesystem \(ext4, XFS, APFS, NTFS\) where file locking works reliably. If several machines need access to the data, switch to a client-server database \(PostgreSQL or MySQL\), or keep a local SQLite database on each machine and replicate it. Root cause: on Unix-like systems SQLite relies on POSIX advisory locks \(fcntl\) for concurrency control. NFS and SMB implementations often provide broken, missing, or non-coherent locking across clients \(especially with the 'nolock' mount option or stale NFS file handles\), so two processes on different machines can write to the file at the same time, which corrupts pages and produces the 'malformed' error.
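To check whether a database path sits on a network mount before opening it, something like the following sketch may help. It assumes Linux and the text of /proc/mounts as input. The helper names `fs_type_for_path` and `is_on_network_fs` and the `NETWORK_FS_TYPES` set are hypothetical, not part of SQLite or the original report, and the sketch does not resolve symlinks or handle escaped spaces in mount points.

```python
import os

# Filesystem types treated as "network" here. This set is an assumption;
# extend it for your environment.
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smb3", "smbfs"}


def fs_type_for_path(path, mounts_text):
    """Return the fs type of the longest mount point containing `path`.

    `mounts_text` is expected to look like /proc/mounts on Linux:
    one mount per line with fields device, mount point, fs type, options.
    Returns None if no mount point matches.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or path.startswith(prefix)
        # Prefer the most specific (longest) matching mount point.
        if inside and len(mount_point) > len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def is_on_network_fs(path, mounts_text):
    """True if `path` lives on one of the NETWORK_FS_TYPES mounts."""
    return fs_type_for_path(path, mounts_text) in NETWORK_FS_TYPES
```

On a Linux node this could be called as `is_on_network_fs(db_path, open("/proc/mounts").read())`, refusing to open the database or logging a warning when it returns True.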
Journey Context:
Users of a scientific computing cluster reported sporadic 'database disk image is malformed' errors in SQLite databases that were written from Python and stored in the shared NFSv3 home directory. The corruption occurred when a job array running on multiple nodes wrote status updates to the same SQLite file. Investigation showed that while Node A held a write lock, the NFS client on Node B treated the lock as available because its lock state had not been refreshed, so B wrote into the middle of a page that A was flushing. Moving the database to /tmp \(local SSD\) eliminated the issue entirely. The official SQLite documentation lists broken or missing file locking on network filesystems as a cause of database corruption.
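A minimal sketch of the local-file approach, assuming each task in the job array keeps its own status database in a node-local directory and checks it with `PRAGMA integrity_check`. The function names, the `status` table layout, and the per-task file naming are hypothetical and not taken from the original report.

```python
import os
import sqlite3
import tempfile


def open_status_db(task_id, base_dir=None):
    """Open (or create) a per-task status database in a local directory.

    `base_dir` defaults to the system temp directory, which is assumed
    here to be on local disk rather than a network share.
    """
    base_dir = base_dir or tempfile.gettempdir()
    path = os.path.join(base_dir, f"status_{task_id}.db")
    conn = sqlite3.connect(path, timeout=30)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS status ("
        "task_id TEXT, state TEXT, "
        "updated_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    conn.commit()
    return conn, path


def record_status(conn, task_id, state):
    """Insert one status row; the `with` block commits or rolls back."""
    with conn:
        conn.execute(
            "INSERT INTO status (task_id, state) VALUES (?, ?)",
            (task_id, state),
        )


def integrity_problems(path):
    """Return an empty list if the database passes integrity_check,
    otherwise the list of problem descriptions SQLite reports."""
    conn = sqlite3.connect(path)
    try:
        rows = [row[0] for row in conn.execute("PRAGMA integrity_check")]
    finally:
        conn.close()
    return [] if rows == ["ok"] else rows
```

Because every task writes to its own file, no two nodes share a lock. Collecting the per-task files into one result afterwards is left out of this sketch.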
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:56:29.168947+00:00— report_created — created