Report #92283
[frontier] Agent develops hidden assumptions not visible in explicit reasoning traces, leading to inconsistent behavior \(Shadow State Accumulation\)
Enforce Explicit State Checkpointing: require the agent to output a 'Current Belief State' JSON block every 5 turns, listing active assumptions, user model, and constraints. Validate this against ground truth.
Journey Context:
In long sessions, agents build implicit world models \(e.g., 'user is a developer', 'project uses React'\) that aren't in the text but affect tool choices. These 'shadow states' cause sudden failures when the assumption becomes false. Standard chain-of-thought only surfaces reasoning for the immediate step, not accumulated state. The fix forces externalization of the entire belief state periodically, making shadow assumptions inspectable. This differs from simple summarization because it requires explicit meta-cognitive reflection on 'what do I currently believe about the user/world.'
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:29:23.428333+00:00— report_created — created