Report #21503
[synthesis] Agent executes a tool call that returns a success status, but the result is semantically empty or wrong, leading the agent to falsely conclude the task is done
Tool return schemas must include semantic validation fields (e.g., rows_affected, is_empty, warning). The agent's system prompt must explicitly instruct it to check these semantic fields, not just the status code, before terminating the task.
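A minimal sketch of what such a schema and check could look like. The `ToolResult` class and the `may_terminate` helper are hypothetical names introduced here for illustration; only the field names rows_affected, is_empty, and warning come from the report. The rule that any warning blocks termination is an assumption and may be too strict for some tools.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolResult:
    """Hypothetical tool return schema; field names follow the report's examples."""
    status: str                    # transport-level outcome, e.g. "success"
    rows_affected: int = 0         # how many records the call actually changed
    is_empty: bool = True          # True when the call produced no data or effect
    warning: Optional[str] = None  # hint set by the tool when the result looks suspicious


def may_terminate(result: ToolResult) -> bool:
    """Return True only if the status and the semantic fields both look healthy."""
    if result.status != "success":
        return False
    if result.is_empty or result.rows_affected == 0:
        # A successful call that did nothing is treated as unfinished work.
        return False
    # Assumption: any warning from the tool blocks termination.
    return result.warning is None
```

In this sketch the defaults are pessimistic (zero rows, empty result), so a tool that forgets to fill in the semantic fields fails the check instead of passing it silently.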
Journey Context:
APIs are designed for machines, but agents interpret them like humans. A human sees "Deleted 0 rows" and knows something is wrong. An agent sees "status: success" and stops. The success status on a call that changed nothing masks a total failure of the task. The fix requires shifting the burden of semantic checking to the tool interface itself, because the agent cannot reliably infer that 0 rows is bad without domain context.
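One way to move that check into the tool itself is sketched below, using Python's standard sqlite3 module. The `delete_inactive_users` function, the `users` table, and the wording of the warning are all hypothetical; the point is only that the tool, which has the domain context, flags the zero-row case explicitly.

```python
import sqlite3


def delete_inactive_users(conn: sqlite3.Connection) -> dict:
    """Hypothetical tool: deletes rows and reports semantic fields next to the status."""
    cur = conn.execute("DELETE FROM users WHERE active = 0")
    conn.commit()
    rows = cur.rowcount  # number of rows removed by the DELETE
    result = {
        "status": "success",
        "rows_affected": rows,
        "is_empty": rows == 0,
        "warning": None,
    }
    if rows == 0:
        # The tool knows that a no-op delete is suspicious, so it says so.
        result["warning"] = "No rows matched; the filter may be wrong or the work was already done."
    return result


# Usage: the second call succeeds at the transport level but does nothing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, active INTEGER)")
conn.executemany("INSERT INTO users (active) VALUES (?)", [(0,), (1,), (0,)])

first = delete_inactive_users(conn)
second = delete_inactive_users(conn)
```

Both calls return a success status, but only the second carries is_empty set to True and a non-empty warning, which gives the agent something concrete to check before it stops.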
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T14:30:41.017137+00:00— report_created — created