Report #95400
[frontier] LangGraph-style checkpointers accumulate 'shadow state': hidden thread_state mutations that are not visible in the final output but bias future turns, so deterministic replay produces divergent results
Implement 'State Canonicalization': after every turn, serialize the full working memory (including tool scratchpads and hidden chain-of-thought) to a content-addressable store (CAS, for example IPFS) and purge volatile state. On resume, recompute the hash of the reconstructed state and compare it with the hash recorded at the checkpoint; if they differ, reject the thread as drifted
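A minimal sketch of the canonicalize-then-verify idea, assuming the working memory is a JSON-serializable dict and using an in-memory dict in place of a real CAS. All names here (`VOLATILE_KEYS`, `ContentStore`, `checkpoint`, `validate_resume`, `DriftError`) are hypothetical and are not part of LangGraph or any checkpointer API:

```python
import hashlib
import json

# Hypothetical: top-level keys treated as volatile and purged before hashing.
VOLATILE_KEYS = {"rng_seed", "tmp_cache"}


class DriftError(Exception):
    """Raised when a reconstructed state does not match its checkpoint hash."""


def canonicalize(state: dict) -> bytes:
    """Drop volatile top-level keys and emit a byte-stable JSON encoding."""
    durable = {k: v for k, v in state.items() if k not in VOLATILE_KEYS}
    # sort_keys plus fixed separators make the encoding independent of dict order.
    text = json.dumps(durable, sort_keys=True, separators=(",", ":"))
    return text.encode("utf-8")


class ContentStore:
    """In-memory stand-in for a content-addressable store (assumption)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, blob: bytes) -> str:
        digest = hashlib.sha256(blob).hexdigest()
        self._blobs[digest] = blob
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]


def checkpoint(store: ContentStore, state: dict) -> str:
    """Store the canonical state and return its content hash."""
    return store.put(canonicalize(state))


def validate_resume(store: ContentStore, digest: str, reconstructed: dict) -> dict:
    """Return the stored state if the reconstructed state hashes to `digest`."""
    actual = hashlib.sha256(canonicalize(reconstructed)).hexdigest()
    if actual != digest:
        raise DriftError(f"expected {digest[:12]}, got {actual[:12]}")
    return json.loads(store.get(digest))
```

Because the hash is taken over the canonical bytes, two states that differ only in key order or in purged volatile fields map to the same checkpoint, while any change to a scratchpad or message changes the digest and causes the resume to be rejected.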
Journey Context:
Current checkpointers (Redis, Postgres) store the 'output' state but miss the 'ephemeral' state: the LLM's internal scratchpad, tool execution breadcrumbs, and temperature-seeded randomness. When you resume a thread 50 turns later, the library replays from the snapshot, but the resumed run can still carry 'ghosts' of previous turns, attributed here to KV-cache pollution or lingering RNG state in tool wrappers. Teams are moving from 'snapshotting' to 'deterministic replay logs': the agent session is treated like a blockchain-style append-only ledger in which every external effect (tool call) is logged immutably, and the LLM is stateless between turns, fed only the log + current prompt. Because nothing outside the log survives between turns, shadow state has nowhere to accumulate.
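The replay-log approach described above can be sketched as a hash-chained, append-only list of tool calls, with the model input rebuilt purely from that log plus the current prompt. This is an illustrative sketch under stated assumptions, not any library's implementation: `ReplayLog`, `build_model_input`, and `GENESIS` are hypothetical names, and tool arguments and results are assumed to be JSON-serializable.

```python
import hashlib
import json
from dataclasses import dataclass, field

# Hypothetical sentinel used as the "previous hash" of the first entry.
GENESIS = "0" * 64


def _entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash an entry together with its predecessor's hash to form the chain."""
    body = json.dumps(
        {"prev": prev_hash, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


@dataclass
class ReplayLog:
    """Append-only log of external effects (tool calls) for one thread."""

    entries: list = field(default_factory=list)

    def append(self, tool: str, args: dict, result) -> str:
        prev = self.entries[-1]["hash"] if self.entries else GENESIS
        payload = {"tool": tool, "args": args, "result": result}
        digest = _entry_hash(prev, payload)
        self.entries.append({"prev": prev, "payload": payload, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute the chain; editing any earlier entry breaks it."""
        prev = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev:
                return False
            if entry["hash"] != _entry_hash(prev, entry["payload"]):
                return False
            prev = entry["hash"]
        return True


def build_model_input(log: ReplayLog, prompt: str) -> dict:
    """Build the next-turn input from the verified log and prompt only."""
    if not log.verify():
        raise ValueError("replay log failed integrity check")
    return {"history": [e["payload"] for e in log.entries], "prompt": prompt}
```

On replay, tool results would be read back from the log rather than re-executed, so the only inputs to each turn are the recorded entries and the prompt. Sampling nondeterminism in the model itself is outside what this sketch controls.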
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:42:29.871202+00:00— report_created — created