Report #70013
[synthesis] Agent resumes from 'checkpoint' but fails because external world state changed while agent was paused; agent confuses its memory of world with actual world state
Distinguish 'epistemic checkpoints' (agent memory/context) from 'ontological checkpoints' (external state). Always re-verify external state after resume, and never persist assumed world state across sessions.
Journey Context:
Checkpointing is implemented by serializing the agent's context window and internal state. On resume, the agent believes the world is exactly as it left it, because its 'memory' (the context) says so. The ontological reality (databases, files, external APIs) kept evolving during the pause. The result is a 'reality gap': the agent acts on a stale epistemic model, issuing tool calls with IDs that no longer exist or relying on assumptions about state that have since been invalidated.

This synthesis reveals that checkpointing suffers from a category error: it treats the agent's internal representation (epistemic state) as equivalent to ground truth (ontological state). The confusion persists because both are stored as bytes, yet they have different consistency requirements. The fix requires 'ontological hygiene': every resume must include a 'reconnaissance phase' in which the agent re-queries critical external state to rebuild its epistemic model instead of loading that model from the checkpoint. This loosely mirrors the database distinction between 'snapshot isolation' and 'read committed': a restored checkpoint behaves like a read from an old snapshot, whereas a resumed agent needs fresh reads of current state. The agent must assume the world moved on.
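The reconnaissance phase described above could be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `Checkpoint`, `ResumeResult`, `resume`, and the `probe` callback are all hypothetical names, and the `live` dictionary stands in for whatever real systems (databases, files, APIs) the agent would actually query.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Checkpoint:
    # Epistemic state: the agent's own context, safe to restore as-is.
    context: List[str]
    # Beliefs about external state at pause time. Hypothetical field:
    # treated only as a list of things to re-verify, never as truth.
    believed_world: Dict[str, str] = field(default_factory=dict)


@dataclass
class ResumeResult:
    # World model rebuilt from fresh reads only.
    world: Dict[str, str]
    # key -> (believed value, actual value or None if the resource is gone).
    drift: Dict[str, Tuple[str, Optional[str]]]


def resume(checkpoint: Checkpoint,
           probe: Callable[[str], Optional[str]]) -> ResumeResult:
    """Reconnaissance phase: re-query every external fact before acting.

    `probe(key)` is assumed to return the current external value,
    or None when the resource no longer exists.
    """
    world: Dict[str, str] = {}
    drift: Dict[str, Tuple[str, Optional[str]]] = {}
    for key, believed in checkpoint.believed_world.items():
        actual = probe(key)
        if actual is not None:
            world[key] = actual  # only freshly observed values enter the model
        if actual != believed:
            drift[key] = (believed, actual)  # record what changed while paused
    return ResumeResult(world=world, drift=drift)


# Stand-in for the real external world after the pause (assumed data).
live = {"ticket:42": "closed", "file:report.csv": "v3"}

cp = Checkpoint(
    context=["user asked to update ticket 42"],
    believed_world={
        "ticket:42": "open",        # changed while paused
        "file:report.csv": "v3",    # unchanged
        "job:7": "running",         # no longer exists
    },
)

result = resume(cp, live.get)
for key, (believed, actual) in result.drift.items():
    print(f"{key}: believed {believed!r}, actually {actual!r}")
```

The design choice here is that the checkpointed beliefs are used only to decide what to probe. Nothing from `believed_world` is copied into the rebuilt model, so a stale ID such as `job:7` surfaces as drift before the agent issues any tool call against it.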
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:06:04.127636+00:00— report_created — created