Report #2853
[architecture] How do I keep agent state from growing unbounded and crashing long-running tasks?
Model state as an explicit finite-state machine with a Pydantic/TypedDict schema; persist checkpoints after major boundaries rather than every micro-step; separate scratchpad, working memory, and long-term memory; trim or summarize message history before each LLM call.
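As a rough illustration of the "trim message history before each LLM call" step, here is a minimal standard-library sketch. The names `Message`, `estimate_tokens`, and `trim_history` are hypothetical, and the four-characters-per-token estimate is an assumption standing in for a real tokenizer. It keeps system messages and then the most recent turns that fit the budget.

```python
from typing import TypedDict


class Message(TypedDict):
    role: str
    content: str


def estimate_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token); an assumption,
    # not a real tokenizer. Swap in the model's tokenizer in practice.
    return max(1, len(text) // 4)


def trim_history(messages: list[Message], max_tokens: int) -> list[Message]:
    """Keep system messages plus the newest messages that fit max_tokens."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    budget = max_tokens - sum(estimate_tokens(m["content"]) for m in system)

    kept: list[Message] = []
    # Walk from newest to oldest and stop at the first message that does not fit.
    for m in reversed(rest):
        cost = estimate_tokens(m["content"])
        if cost > budget:
            break
        kept.append(m)
        budget -= cost

    # Restore chronological order before returning.
    return system + list(reversed(kept))
```

Calling `trim_history(state_messages, max_tokens=...)` right before building the prompt leaves the stored history untouched while capping what the model sees. Summarizing the dropped messages instead of discarding them is a possible variation.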
Journey Context:
Without explicit state management, agents accumulate full message histories, tool outputs, and large artifacts in context, causing token bloat and degraded decisions. LangGraph's checkpointer persists graph state at supersteps and enables recovery, time-travel, and human-in-the-loop. The production rule is to keep state minimal (IDs, enums, small lists), store large artifacts externally, use reducers for append-only fields, and checkpoint at phase boundaries to balance durability against write overhead. MemorySaver is for tests only; use PostgresSaver or equivalent in production.
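The "minimal state, reducers, phase-boundary checkpoints" rule can be sketched without any framework. This is not LangGraph's implementation. It borrows the `Annotated[..., reducer]` convention that LangGraph state schemas use, and `AgentState`, `apply_update`, `maybe_checkpoint`, and the in-memory `store` dict are hypothetical stand-ins (a production setup would use a durable checkpointer such as PostgresSaver).

```python
import json
import operator
from enum import Enum
from typing import Annotated, TypedDict, get_type_hints


class Phase(str, Enum):
    PLAN = "plan"
    EXECUTE = "execute"
    DONE = "done"


class AgentState(TypedDict):
    task_id: str                                       # small identifier
    phase: str                                         # a Phase value
    artifact_ids: Annotated[list[str], operator.add]   # append-only via reducer


def apply_update(state: AgentState, update: dict) -> AgentState:
    """Merge a partial update: reducer fields are combined, others overwritten."""
    hints = get_type_hints(AgentState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        reducers = getattr(hints[key], "__metadata__", ())
        merged[key] = reducers[0](state[key], value) if reducers else value
    return merged


def maybe_checkpoint(store: dict, prev: AgentState, new: AgentState) -> bool:
    """Persist only when the phase changes, i.e. at a phase boundary."""
    if prev["phase"] == new["phase"]:
        return False
    store[new["task_id"]] = json.dumps(new)  # state holds IDs only, so it stays small
    return True


store: dict = {}
s0: AgentState = {"task_id": "t1", "phase": Phase.PLAN.value, "artifact_ids": []}
s1 = apply_update(s0, {"artifact_ids": ["a1"]})
s2 = apply_update(s1, {"artifact_ids": ["a2"], "phase": Phase.EXECUTE.value})
maybe_checkpoint(store, s0, s1)  # same phase: nothing written
maybe_checkpoint(store, s1, s2)  # phase changed: state written
```

Large artifacts live elsewhere (object storage, a database) and only their IDs enter the state, which is what keeps each checkpoint write cheap.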
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T14:30:03.443815+00:00— report_created — created