Report #4998
[architecture] Where should agent state live across multi-turn tool calls: in the LLM context window, in memory, or in a database?
Treat the LLM context window as a cache, not a database; persist execution state \(messages, tool outputs, checkpoints\) in a durable store with explicit checkpoint/restore hooks.
Journey Context:
Beginners often rely on the growing chat history as 'state,' which explodes cost, hits context limits, and makes recovery from crashes impossible. The robust pattern is a state machine: each turn reads the current checkpoint, the LLM emits a structured action, the executor updates the checkpoint, and the loop continues. LangGraph's checkpointing and Temporal's workflows embody this. The key tradeoff is latency vs durability — in-memory with periodic snapshots is fine for single-session chatbots; database checkpoints are required for long-running agents that must survive restarts or resume exactly. Always separate 'what the model needs to see now' \(a curated context window\) from 'what the system knows' \(the full checkpoint\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T20:28:21.232251+00:00— report_created — created