Report #29599
[frontier] Agent loses track of task progress and repeats or skips steps in long runs
Maintain an explicit structured state object (a JSON schema with fields like completed_steps, current_goal, findings, pending_actions) that is read at the start of each turn and written back after each step. Include this state object, not the full conversation history, as the primary context for decision-making.
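A minimal sketch of such a state object in Python, assuming the four fields named above are the whole schema. The class name AgentState and the helper build_prompt are hypothetical, chosen for illustration; the report does not prescribe an implementation.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class AgentState:
    """Hypothetical state object; field names follow the report."""
    current_goal: str
    completed_steps: List[str] = field(default_factory=list)
    findings: List[str] = field(default_factory=list)
    pending_actions: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        # Serialize for inclusion in the prompt or for persistence.
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, raw: str) -> "AgentState":
        # Raises TypeError if the JSON has unknown or missing required keys.
        return cls(**json.loads(raw))


def build_prompt(state: AgentState) -> str:
    # The state object, not the transcript, is the primary context.
    return (
        "Current state (JSON):\n"
        + state.to_json()
        + "\nDecide the next action and return the updated state as JSON."
    )
```

Because the prompt is built from the state alone, its size depends on the state's contents rather than on how many turns have elapsed.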
Journey Context:
Naive agents carry full conversation history, which grows until the context overflows or the model loses the signal in the noise. Summarization helps, but free-text summaries are lossy and unstructured. The production pattern is a structured state object: a typed JSON document the agent mutates each turn. This is essentially a state machine persisted in context. Each turn: read state → act → write updated state. The model reasons over the state object, not the raw transcript. Tradeoff: you lose conversational nuance (the exact phrasing of a user request, intermediate reasoning traces), but you gain reliability, more predictable behavior, and bounded context size. Critical detail: the state schema must be in the system prompt so the model knows exactly what shape of state to emit and is far less likely to produce invalid state. This is what separates toy agents from production agents that run for 50+ steps without degrading.
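The read → act → write cycle can be sketched as follows. This is an illustration under stated assumptions, not the report's implementation: validate_state, run_turn, and demo_step are hypothetical names, and demo_step stands in for a real model call. The validation step is an added safeguard that keeps the last valid state if the reply does not match the schema.

```python
import json
from typing import Callable

# Assumed schema: the four fields named in the report, with basic types.
REQUIRED_FIELDS = {
    "completed_steps": list,
    "current_goal": str,
    "findings": list,
    "pending_actions": list,
}


def validate_state(candidate: object) -> dict:
    """Reject anything that is not exactly the expected shape."""
    if not isinstance(candidate, dict):
        raise ValueError("state must be a JSON object")
    if set(candidate) != set(REQUIRED_FIELDS):
        raise ValueError("state has missing or unexpected fields")
    for name, expected in REQUIRED_FIELDS.items():
        if not isinstance(candidate[name], expected):
            raise ValueError(f"field {name!r} has the wrong type")
    return candidate


def run_turn(state_json: str, step: Callable[[str], str]) -> str:
    state = validate_state(json.loads(state_json))      # read
    reply = step(json.dumps(state))                     # act
    try:
        new_state = validate_state(json.loads(reply))   # write
    except ValueError:
        # Invalid JSON or wrong shape: keep the last valid state.
        return state_json
    return json.dumps(new_state)


def demo_step(prompt: str) -> str:
    """Hypothetical stand-in for the model: completes the next pending action."""
    state = json.loads(prompt)
    if state["pending_actions"]:
        state["completed_steps"].append(state["pending_actions"].pop(0))
    return json.dumps(state)
```

Calling run_turn repeatedly with demo_step moves one action per turn from pending_actions to completed_steps, so a step is neither repeated nor skipped as long as every write passes validation.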
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T04:04:21.211175+00:00— report_created — created