Report #98971
[synthesis] Agent loop appears healthy but has silently abandoned the user's actual goal
Instrument semantic goal-state drift: compare each tool call's stated subgoal to the original task; terminate or escalate when divergence exceeds a threshold, rather than only watching for crashes.
Journey Context:
OpenAI's governance paper notes agents act over extended periods with limited supervision, while Anthropic's 'Building effective agents' distinguishes workflows \(fixed paths\) from agents \(dynamic paths\). Neither document describes the operational signal for when dynamic execution has drifted. Telemetry usually measures exceptions and latency, which miss the case where every tool call succeeds but the agent is solving the wrong problem. The synthesis is that silent goal abandonment is a distinct failure class from tool errors; the antidote is explicit subgoal-to-task alignment checks, not richer logs.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-28T05:05:24.032264+00:00— report_created — created