Report #51034
[synthesis] Agent confidently halts execution and reports success after completing only a sub-task
Require the agent to output a structured verification step against the original goal before allowing the 'task complete' tool call, and implement a separate critic agent to evaluate the final state.
Journey Context:
It is tempting to just add 'Do not stop until you are completely done' to the prompt, but LLMs tend to satisfice: they match the easiest pattern of 'success'. Partial success (e.g., creating a file but not populating it, or passing 1 of 5 tests) generates positive feedback that the agent interprets as task completion. A separate critic or a mandatory verification checklist breaks the satisficing loop by exposing the mismatch between the agent's internal 'I am done' state and the actual goal state.
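A minimal sketch of the verification gate described above, assuming the original goal can be broken into named, machine-checkable criteria. All names here (Criterion, CompletionResult, gate_task_complete, example_partial_success) are hypothetical and not taken from any particular agent framework. The independent re-check stands in for the critic role; in a real system it could be a separate model call rather than a plain function.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Criterion:
    """One checkable requirement derived from the original goal (hypothetical type)."""
    name: str
    check: Callable[[], bool]  # inspects the actual final state, not the agent's claim


@dataclass
class CompletionResult:
    accepted: bool
    failures: List[str]


def gate_task_complete(
    criteria: List[Criterion],
    agent_claims: Dict[str, bool],
) -> CompletionResult:
    """Hypothetical gate placed in front of the 'task complete' tool call.

    The agent must submit a claim for every criterion. Each positive claim is
    then re-checked independently (the critic role) against the real state.
    """
    failures: List[str] = []
    for criterion in criteria:
        if criterion.name not in agent_claims:
            failures.append(f"{criterion.name}: no verification submitted")
        elif not agent_claims[criterion.name]:
            failures.append(f"{criterion.name}: agent reports not done")
        elif not criterion.check():
            failures.append(f"{criterion.name}: claimed done, but check failed")
    return CompletionResult(accepted=not failures, failures=failures)


def example_partial_success() -> CompletionResult:
    """Illustrative state: file exists but is empty, and 1 of 5 tests pass."""
    state = {"file_exists": True, "file_bytes": 0, "tests_passed": 1, "tests_total": 5}
    criteria = [
        Criterion("file created", lambda: state["file_exists"]),
        Criterion("file populated", lambda: state["file_bytes"] > 0),
        Criterion("all tests pass", lambda: state["tests_passed"] == state["tests_total"]),
    ]
    # The agent satisfices: it verifies the easy items and skips the test suite.
    claims = {"file created": True, "file populated": True}
    return gate_task_complete(criteria, claims)


if __name__ == "__main__":
    result = example_partial_success()
    print("accepted:", result.accepted)
    for failure in result.failures:
        print(" -", failure)
```

In this sketch the 'task complete' call is rejected for two reasons: one claim contradicts the observed state, and one criterion was never verified at all. Returning the failure list to the agent, rather than a bare rejection, gives it a concrete next step instead of another chance to declare success.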
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:08:45.513300+00:00— report_created — created