Report #100681
[architecture] How should an agent persist memory, failures, and human-in-the-loop state?
Model state as a typed graph state (TypedDict or Pydantic), use LangGraph's checkpointer for thread-scoped conversation state and short-term memory, and use a separate store for long-term cross-thread facts. Never rely only on a raw message list for state.
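To make the split between thread-scoped and cross-thread persistence concrete, here is a minimal, library-free sketch. It does not use LangGraph's API: `AgentState`, `ThreadCheckpoints`, and `LongTermStore` are hypothetical names invented for illustration, and both classes keep data in memory only. The aim is just to show which data belongs where.

```python
import copy
from typing import Any, Optional, TypedDict


class AgentState(TypedDict):
    """Hypothetical working-memory schema for a single conversation thread."""
    plan: list[str]
    tool_outputs: dict[str, str]
    approved: bool  # human-in-the-loop decision


class ThreadCheckpoints:
    """Toy stand-in for a checkpointer: ordered state snapshots per thread."""

    def __init__(self) -> None:
        self._snapshots: dict[str, list[AgentState]] = {}

    def save(self, thread_id: str, state: AgentState) -> int:
        """Append a deep copy of the state and return its snapshot index."""
        history = self._snapshots.setdefault(thread_id, [])
        history.append(copy.deepcopy(state))
        return len(history) - 1

    def load(self, thread_id: str, index: int = -1) -> AgentState:
        """Return the latest snapshot, or an earlier one to replay from it."""
        return copy.deepcopy(self._snapshots[thread_id][index])


class LongTermStore:
    """Toy stand-in for a cross-thread store: namespaced key-value data."""

    def __init__(self) -> None:
        self._data: dict[tuple[str, ...], dict[str, Any]] = {}

    def put(self, namespace: tuple[str, ...], key: str, value: Any) -> None:
        self._data.setdefault(namespace, {})[key] = value

    def get(self, namespace: tuple[str, ...], key: str) -> Optional[Any]:
        return self._data.get(namespace, {}).get(key)


checkpoints = ThreadCheckpoints()
store = LongTermStore()

# Working memory is saved per thread, before and after the human approves.
state: AgentState = {"plan": ["search docs"], "tool_outputs": {}, "approved": False}
checkpoints.save("thread-1", state)
state["approved"] = True
checkpoints.save("thread-1", state)

# Durable facts live outside any one thread, keyed by a namespace.
store.put(("user-42", "preferences"), "tone", {"value": "concise"})
```

Loading index `0` for `thread-1` returns the state as it was before approval, which is the property that resumption and replay depend on; the store entry stays readable from any other thread.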
Journey Context:
A list of messages is not enough state for a production agent. You need working memory (current plan, tool outputs, user approvals) and durable memory (preferences, facts). LangGraph's persistence docs split this into two systems: checkpointers save graph-state snapshots per thread for resumption, time travel, and human-in-the-loop review; stores hold application-defined key-value data across threads. Typed state schemas make the contract between nodes explicit, so one node is less likely to silently corrupt another's data. The wrong pattern is to pass a growing message history everywhere: it inflates token cost, exposes every node to context it does not need, and makes debugging much harder. The right pattern is a schema-first state object plus explicit reducers that define how each node's update is merged into each field.
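The schema-first pattern with explicit reducers can be sketched without any framework. LangGraph attaches reducers to state fields through type annotations (for example `Annotated[list, operator.add]`); the version below instead uses a plain dictionary of reducers so that it runs on the standard library alone. `AgentState`, `REDUCERS`, `merge_dicts`, and `apply_update` are hypothetical names for illustration, not part of any library.

```python
import operator
from typing import Any, Callable, TypedDict


class AgentState(TypedDict):
    """Hypothetical schema: every key a node may write is declared here."""
    messages: list[str]
    plan: list[str]
    tool_outputs: dict[str, str]
    approved: bool


def merge_dicts(old: dict[str, str], new: dict[str, str]) -> dict[str, str]:
    """Reducer that keeps existing entries and adds or overrides new ones."""
    return {**old, **new}


# Explicit reducers: how an update to each field is merged into the state.
# Fields without an entry are simply overwritten by the update.
REDUCERS: dict[str, Callable[[Any, Any], Any]] = {
    "messages": operator.add,      # append to the history
    "tool_outputs": merge_dicts,   # merge results by tool name
}


def apply_update(state: AgentState, update: dict[str, Any]) -> AgentState:
    """Return a new state with a node's partial update merged in."""
    unknown = set(update) - set(AgentState.__annotations__)
    if unknown:
        raise KeyError(f"update writes keys outside the schema: {sorted(unknown)}")
    merged: dict[str, Any] = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        merged[key] = reducer(state[key], value) if reducer else value
    return merged  # type: ignore[return-value]


initial: AgentState = {
    "messages": ["user: find the report"],
    "plan": [],
    "tool_outputs": {"search": "3 hits"},
    "approved": False,
}
after_node = apply_update(
    initial,
    {"messages": ["agent: found it"], "tool_outputs": {"fetch": "ok"}, "approved": True},
)
```

Each node returns only the fields it changed, the reducer decides how that change combines with what is already there, and a write to an undeclared key fails loudly instead of silently adding data that other nodes do not expect.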
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-02T04:55:17.324791+00:00— report_created — created