Report #76876
[synthesis] Partial success in sub-tasks masks total failure of the overall orchestration
Require sub-agents and tools to return structured JSON with a strict `status: "success" | "failure"` field and a `validation_checksum` field, rather than relying on natural-language 'success' statements, which the orchestrator can misinterpret.
Journey Context:
An orchestrator asks a sub-agent to write a test and run it. The sub-agent writes the test (partial success) and runs it, and the test fails. The sub-agent's final output nevertheless says 'I have written and executed the test,' and the orchestrator, reading only that summary, marks the task as complete. Natural-language summaries from agents are inherently biased toward claiming completion. The fix is to move from semantic interpretation of tool outputs to deterministic state validation: you cannot trust an agent's self-reported success, so you must verify the side effects.
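For the write-and-run-a-test scenario above, verifying side effects could look like the sketch below: the orchestrator ignores the sub-agent's summary, checks that the test file exists, and runs it itself, trusting only the process exit code. The function name `verify_side_effects` and the choice to run the file directly with the current Python interpreter are assumptions for illustration; a real setup would likely call its own test runner.

```python
import subprocess
import sys
from pathlib import Path


def verify_side_effects(test_file: Path, timeout_s: float = 60.0) -> tuple[bool, str]:
    """Check the observable results of the task instead of the agent's summary."""
    # Side effect 1: the test file must actually exist on disk.
    if not test_file.is_file():
        return False, "test file was never written"

    # Side effect 2: the orchestrator runs the test itself.
    try:
        proc = subprocess.run(
            [sys.executable, str(test_file)],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False, "test run timed out"

    # Only the exit code decides the outcome, never the agent's wording.
    if proc.returncode != 0:
        return False, f"test run exited with code {proc.returncode}"
    return True, "test file exists and exits cleanly"
```

In the failure described above, this check would return `False` for the failing test even though the sub-agent's summary reads as a completed task.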
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:38:05.461052+00:00 - report_created - created