Report #88096
[tooling] llama.cpp --mlock has no effect or still swaps to disk under memory pressure
Raise the locked-memory limit before launching llama.cpp with `--mlock`: run `ulimit -l unlimited` in the launching shell (Linux), or set `LimitMEMLOCK=infinity` in the systemd unit. Raising the soft limit above the hard limit requires root.
Journey Context:
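The `--mlock` flag asks the OS to keep the model weights resident in RAM instead of paging them out to swap, which matters for consistent inference latency. On Linux, however, `mlock()` fails when the process's `RLIMIT_MEMLOCK` (`ulimit -l`) is lower than the amount of memory being locked, and the default is often only 64 KB, far below the size of a model. The failure is not fatal: llama.cpp keeps running with the weights unlocked (at most a warning appears in the startup output), so users conclude the flag is broken when the real cause is a system limit. An alternative is `--no-mmap`, which loads the model directly into RAM instead of memory-mapping the file, but that increases startup time. Raising the limit is the correct setup for production servers.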
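Before launching, it can help to confirm that the limit is actually high enough for the model in question. The following is a minimal preflight sketch, not part of llama.cpp: the helper names `can_lock`, `format_limit`, and `check_model` are hypothetical, and it assumes the model's file size is a reasonable approximation of the memory that `--mlock` will try to lock. It only reads the limit through Python's standard `resource` module and does not change anything.

```python
import os
import resource
import sys


def format_limit(limit: int) -> str:
    """Render an rlimit value; RLIM_INFINITY means there is no cap."""
    return "unlimited" if limit == resource.RLIM_INFINITY else f"{limit} bytes"


def can_lock(model_bytes: int, soft_limit: int) -> bool:
    """Return True if a soft RLIMIT_MEMLOCK of soft_limit allows locking model_bytes.

    Assumption: the bytes to lock are roughly the model file size. Processes
    with CAP_IPC_LOCK (for example root) are not bound by this limit.
    """
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= model_bytes


def check_model(path: str) -> bool:
    """Compare the model file size against this process's RLIMIT_MEMLOCK."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    size = os.path.getsize(path)
    ok = can_lock(size, soft)
    print(f"model size:     {size} bytes")
    print(f"RLIMIT_MEMLOCK: soft={format_limit(soft)} hard={format_limit(hard)}")
    if ok:
        print("limit looks sufficient for --mlock")
    else:
        print("limit too low: raise it (e.g. 'ulimit -l unlimited') before launching")
    return ok


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: check_memlock.py /path/to/model.gguf")
    sys.exit(0 if check_model(sys.argv[1]) else 1)
```

Run it from the same shell or service environment that will start llama.cpp, since resource limits are inherited per process; a check made in a different session can report a different limit.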
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T06:27:11.229292+00:00— report_created — created