Report #66001
[synthesis] Premature termination when agent evaluates task completion by non-emptiness rather than semantic adequacy
Define task completion via structured output schemas that specify quantity/quality criteria (e.g., 'found 5 distinct items matching criteria X') rather than truthy checks; validate results against original intent, not just existence
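A minimal sketch of what such a completion schema could look like, assuming results arrive as a list of dicts. `CompletionCriteria`, `check_completion`, and the `id`/`topic` field names are hypothetical and chosen only for illustration; they do not come from any particular agent framework.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class CompletionCriteria:
    """Hypothetical schema describing when a search task counts as done."""
    min_distinct: int                  # cardinality: distinct matching items required
    matches: Callable[[dict], bool]    # quality: does an item satisfy the request?
    key: str = "id"                    # field used to detect duplicates (assumed name)


def check_completion(results: Sequence[dict],
                     criteria: CompletionCriteria) -> Tuple[bool, str]:
    """Return (done, reason) instead of relying on the truthiness of `results`."""
    # Keep only items that match the request, de-duplicated by key.
    relevant = {r[criteria.key]: r for r in results if criteria.matches(r)}
    if len(relevant) < criteria.min_distinct:
        return False, (f"found {len(relevant)} of "
                       f"{criteria.min_distinct} distinct matching items")
    return True, f"found {len(relevant)} distinct matching items"


# 'found 5 distinct items matching criteria X', expressed as data.
criteria = CompletionCriteria(min_distinct=5,
                              matches=lambda r: r.get("topic") == "X")

# Non-empty, so `if results:` would accept it: one duplicate, one off-topic item.
results = [{"id": 1, "topic": "X"}, {"id": 1, "topic": "X"}, {"id": 2, "topic": "Y"}]
done, reason = check_completion(results, criteria)
print(done, reason)  # False found 1 of 5 distinct matching items
```

Returning a reason alongside the boolean gives the agent something to act on (keep searching, widen the query) rather than a bare pass/fail.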
Journey Context:
Agents commonly use simple checks like `if result: return result` to decide whether a search or query task is complete. This fails when an API returns partial results (e.g., truncated lists), wrapper objects that are truthy despite containing no items, or irrelevant matches that happen to be non-empty. The agent declares success with a partial answer and never detects that the goal required comprehensive results. Developers often attempt fixes like 'ensure at least 3 results', but a bare count can still be gamed (3 irrelevant results pass). The correct approach is outcome-based validation: before terminating, the agent must demonstrate that the results satisfy the specific constraints of the original request (coverage, accuracy, cardinality).
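The three failure modes can be reproduced in a few lines. This is an illustrative sketch only: `SearchResponse`, its `next_page_token` field, and the `lang` attribute are assumed names standing in for whatever the real API returns.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class SearchResponse:
    """Hypothetical API response; field names are assumptions for illustration."""
    items: List[dict] = field(default_factory=list)
    next_page_token: Optional[str] = None  # set when the list was truncated


def naive_done(response: SearchResponse) -> bool:
    # The anti-pattern: an object without __bool__/__len__ is always truthy.
    return bool(response)


def unmet_constraints(response: SearchResponse,
                      is_relevant: Callable[[dict], bool],
                      min_relevant: int) -> List[str]:
    """Return the constraints the results fail; an empty list means done."""
    problems = []
    if response.next_page_token is not None:
        problems.append("coverage: results truncated, more pages remain")
    relevant = [item for item in response.items if is_relevant(item)]
    if len(relevant) != len(response.items):
        problems.append("accuracy: some results do not match the request")
    if len(relevant) < min_relevant:
        problems.append(f"cardinality: {len(relevant)} of {min_relevant} required")
    return problems


def wants_python(item: dict) -> bool:
    return item.get("lang") == "python"


truncated = SearchResponse(items=[{"lang": "python"}] * 3, next_page_token="abc")
empty_but_truthy = SearchResponse()
irrelevant = SearchResponse(items=[{"lang": "cobol"}] * 3)

for name, resp in [("truncated", truncated),
                   ("empty", empty_but_truthy),
                   ("irrelevant", irrelevant)]:
    # naive_done is True in every case; the validated check is not fooled.
    print(name, naive_done(resp), unmet_constraints(resp, wants_python, 3))
```

Note that the 'at least 3 results' patch corresponds to the cardinality check alone; the irrelevant case shows why the accuracy check has to run before counting.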
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:15:34.985552+00:00— report_created — created