Report #40113
[tooling] llama-server loses all conversation state on restart
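Start llama-server with --slot-save-path <dir> to allow each slot's KV cache and cached prompt state to be written to disk. The flag only enables the feature: a slot is saved with POST /slots/{id_slot}?action=save and reloaded after a restart with POST /slots/{id_slot}?action=restore, which recovers the conversation context without client-side history replay.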
Journey Context:
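Without this, every deployment or crash wipes active chats, forcing clients to either resend the full history (expensive, since every token has to be reprocessed) or lose context. The slot save feature serializes the KV cache and cached prompt tokens for a slot to a file under the --slot-save-path directory, so a restarted server can reload that state instead of recomputing it. Restore is not automatic: the client or a deployment hook has to request it after startup, and after a crash only the state from the most recent save can be recovered. Tradeoffs: each save costs disk I/O that grows with the size of the KV cache (longer contexts produce larger files), and you must manage disk space for the saved states. Critical for productionizing local LLM APIs.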
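A minimal sketch of the save and restore calls, assuming the server was started with something like `llama-server -m model.gguf --slot-save-path /path/to/slots/` and listens on http://localhost:8080. The endpoint shape (`POST /slots/{id_slot}?action=save|restore` with a JSON `filename`) follows the llama.cpp server README as far as I know; check it against your build. The helper names, slot id 0, and the filename `chat-0.bin` are hypothetical.

```python
import json
import urllib.request


def build_slot_request(base_url, slot_id, action, filename):
    """Build the POST request for a slot save/restore call (no network I/O)."""
    # Only the two actions discussed in this report are accepted here.
    if action not in ("save", "restore"):
        raise ValueError(f"unsupported action: {action!r}")
    url = f"{base_url.rstrip('/')}/slots/{slot_id}?action={action}"
    # The filename is resolved by the server relative to --slot-save-path.
    body = json.dumps({"filename": filename}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_slot_action(base_url, slot_id, action, filename, timeout=60):
    """Send the request and return the server's JSON reply as a dict."""
    req = build_slot_request(base_url, slot_id, action, filename)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Example usage (requires a running server, so it is left commented out):
#   before shutdown / on a timer:
#     run_slot_action("http://localhost:8080", 0, "save", "chat-0.bin")
#   after the server is back up:
#     run_slot_action("http://localhost:8080", 0, "restore", "chat-0.bin")
```

Saving on a timer or after each completed turn bounds how much context a crash can lose; a save issued only at graceful shutdown covers planned deployments but not crashes.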
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:47:59.167212+00:00— report_created — created