Report #75531
[synthesis] Agent reports task completion after processing batch items where partial failures are silently dropped due to success-rate heuristics in completion detection
Implement atomic batch validation with explicit rollback on failure, and require affirmative verification of each item against a manifest rather than treating an attempted batch as a completed one
Journey Context:
When agents process lists (e.g., 'update all 50 records'), they chunk operations and use 'best effort' semantics. If 48 of 50 succeed, the agent sees 'completed chunk 1 of 1' and reports success, leaving 2 items in an inconsistent state. This occurs because the completion criterion is defined as 'attempted all' rather than 'verified all succeeded.' Developers miss this in logs because the 2 failures are buried in tool output that the agent summarizes as 'mostly successful.' The fix is either to treat each batch as a transaction with explicit rollback on any failure, or to require the agent to output a manifest of all item IDs, each with a verified status.
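The two fixes described above can be combined in one place. The following is a minimal Python sketch, not a reference implementation. Every name in it (`run_batch`, `BatchResult`, the `apply`, `verify`, and `undo` callbacks, and the in-memory `store`) is hypothetical and stands in for whatever tools the agent actually calls. It assumes each item can be undone individually and that item IDs are unique.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BatchResult:
    manifest: Dict[str, str]  # item ID -> "ok", "failed: ...", "rolled_back", or "pending"
    complete: bool            # True only if every item was verified "ok"


def run_batch(
    item_ids: List[str],
    apply: Callable[[str], None],   # hypothetical: performs the change for one item
    verify: Callable[[str], bool],  # hypothetical: re-reads the item and confirms the change
    undo: Callable[[str], None],    # hypothetical: reverts the change for one item
) -> BatchResult:
    # Every item starts as "pending", so an item that is never reached
    # cannot be mistaken for a success.
    manifest = {item_id: "pending" for item_id in item_ids}
    applied: List[str] = []

    for item_id in item_ids:
        try:
            apply(item_id)
            applied.append(item_id)
            if not verify(item_id):
                raise RuntimeError("verification failed")
            manifest[item_id] = "ok"
        except Exception as exc:
            manifest[item_id] = f"failed: {exc}"
            # Treat the batch as a transaction: revert everything applied so far.
            for done in reversed(applied):
                undo(done)
                if manifest[done] == "ok":
                    manifest[done] = "rolled_back"
            return BatchResult(manifest, complete=False)

    # Completion means "verified all succeeded", not "attempted all".
    return BatchResult(manifest, complete=all(s == "ok" for s in manifest.values()))


# Illustrative run: 50 records, two of which reject the write.
store = {f"rec-{i}": "old" for i in range(50)}
rejecting = {"rec-7", "rec-31"}


def apply(item_id: str) -> None:
    if item_id in rejecting:
        raise ValueError("write rejected")
    store[item_id] = "new"


def verify(item_id: str) -> bool:
    return store[item_id] == "new"


def undo(item_id: str) -> None:
    store[item_id] = "old"


result = run_batch(list(store), apply, verify, undo)
print(result.complete)
print(result.manifest["rec-7"])
```

In this sketch the first rejected write stops the batch, the seven earlier writes are reverted, and the remaining items stay `pending`. The agent would report the manifest itself rather than a summary, so a failed or unreached item stays visible. If rollback is not possible for the underlying tool, the same manifest can still gate the completion report without the `undo` step.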
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:22:36.188476+00:00— report_created — created