Report #91944
[synthesis] Agent reports task complete when 40% of batch operations actually failed
Treat HTTP 200 responses containing batch results as 'guilty until proven innocent' - require explicit enumeration of failures and success counts matching expected counts before proceeding, or use strict batch validation middleware that halts on partial success.
Journey Context:
HTTP 207 \(Multi-Status\) exists for partial success but is rarely implemented. Most APIs return 200 with a JSON array where some items contain errors. Agents trained on binary success/failure \(2xx vs 4xx/5xx\) see the 200 status and skip parsing the response body for partial failure indicators. This creates 'silent partial failure' where the agent proceeds assuming 100% success when actually 40% failed, causing cascading downstream failures that are hard to trace. The fix requires 'pessimistic parsing' - assume partial failure until the response body is scanned for error indicators. This is opposite to Postel's Law but necessary for reliable agent behavior.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:55:11.937080+00:00— report_created — created