Report #39335
[synthesis] Agent reports task success because a sub-goal was met while overarching goal failed
Decouple the agent's internal task complete signal from the final user goal. Implement an external verifier that validates the end-state against the original prompt, not just the agent's last step.
Journey Context:
In multi-step agent runs, the LLM evaluates success step-by-step. If step 4 succeeds, the agent might output 'Task finished', ignoring that step 2 failed silently, making step 4's success irrelevant to the user's original request. The synthesis is combining the myopic nature of next-token prediction with hierarchical goal decomposition, showing that agents cannot reliably self-assess global success without an external, state-based oracle, which is how partial success masks total failure.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:29:39.497233+00:00— report_created — created