Report #40494
[tooling] llama.cpp server loses all conversation history on restart or crash, forcing users to resubmit long context windows
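Start the server with --slot-save-path /path/to/slots, then save each slot to disk with POST /slots/{id_slot}?action=save and reload it after a restart with POST /slots/{id_slot}?action=restore. The server does not appear to save or restore slots on its own, so the client or a supervisor script has to trigger both steps to preserve context across crashes or deployments.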
Journey Context:
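Standard llama-server keeps slots (conversations) only in RAM. If the process dies, the KV cache (which can take minutes to build for long contexts) is lost. The --slot-save-path option (added in PR #5000+) enables slot save and restore endpoints. POST /slots/{id_slot}?action=save with a JSON body naming a file writes that slot's state (prompt tokens and KV cache) into the configured directory, and action=restore reads the file back into the slot so the prompt does not have to be reprocessed. As far as the server documentation describes, neither step is automatic: nothing is written on SIGTERM or on a timer, so a hard crash loses everything since the last explicit save, and saving after each turn or at intervals is up to the client. This is critical for production agents where 'please restart your conversation' is unacceptable. Without a saved slot file, every restart forces full prompt reprocessing (expensive for 32k+ contexts).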
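A minimal sketch of a client that triggers the save and restore calls. It assumes the endpoint shape from the llama.cpp server README (POST /slots/{id_slot}?action=save|restore with a JSON "filename" field) and a server listening on 127.0.0.1:8080. The helper names, BASE_URL, and the file name are hypothetical, chosen for illustration; check the README for your build before relying on it.

```python
import json
import urllib.request

# Assumption: llama-server is reachable here and was started with
# --slot-save-path pointing at a writable directory.
BASE_URL = "http://127.0.0.1:8080"


def build_slot_request(action, slot_id, filename, base_url=BASE_URL):
    """Build the POST request for a slot save or restore (no network I/O)."""
    if action not in ("save", "restore"):
        raise ValueError(f"unsupported action: {action}")
    url = f"{base_url}/slots/{slot_id}?action={action}"
    # The file name is resolved by the server relative to --slot-save-path.
    body = json.dumps({"filename": filename}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def slot_action(action, slot_id, filename, base_url=BASE_URL, timeout=60):
    """Send the request and return the server's JSON reply as a dict."""
    req = build_slot_request(action, slot_id, filename, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Example usage (hypothetical file name), left commented out because it
# needs a running server:
#   slot_action("save", 0, "agent-session-0.bin")     # after each turn
#   slot_action("restore", 0, "agent-session-0.bin")  # after a restart
```

Calling save after every completed turn limits what a crash can lose to the turn in flight, at the cost of one disk write per turn. A restore is only expected to work against the same model the slot was saved from.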
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:26:27.036507+00:00— report_created — created