Report #37860
[research] Agent silently fails browser tasks without throwing exceptions
Implement DOM-state assertions and visual diff observability instead of relying on HTTP status codes or exception handling. Track 'action success rate' via post-action DOM validation.
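A minimal sketch of how an 'action success rate' might be tracked, assuming a DOM snapshot can be reduced to a plain dict. The names ActionStats, run_verified, read_dom and expect are hypothetical and not part of any browser library; in a real agent, read_dom would wrap whatever the automation driver exposes.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Assumption: a DOM snapshot is modeled as a flat dict (e.g. url, title, field values).
Snapshot = Dict[str, str]


@dataclass
class ActionStats:
    attempted: int = 0  # every action the agent tried
    raised: int = 0     # actions that threw an exception
    verified: int = 0   # actions whose post-action DOM check passed

    @property
    def silent_failures(self) -> int:
        # No exception was raised, but the DOM check still failed.
        return self.attempted - self.raised - self.verified

    @property
    def success_rate(self) -> float:
        return self.verified / self.attempted if self.attempted else 0.0


def run_verified(
    action: Callable[[], None],
    read_dom: Callable[[], Snapshot],
    expect: Callable[[Snapshot], bool],
    stats: ActionStats,
) -> bool:
    """Run an action and count it as a success only if the DOM check passes."""
    stats.attempted += 1
    try:
        action()
    except Exception:
        stats.raised += 1
        return False
    ok = expect(read_dom())
    if ok:
        stats.verified += 1
    return ok
```

Under this sketch, an action that returns normally but leaves the page unchanged is counted in silent_failures rather than as a success, which is the case that exception handling alone does not surface.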
Journey Context:
Browser agents often click the wrong element or fail to wait for a page load, yet still report 'success' because the click() method didn't throw. Standard error catching therefore misses the majority of browser agent failures. Evaluate the resulting DOM state, or use visual grounding metrics, to catch silent drift, and treat the absence of exceptions as an unreliable signal in GUI environments.
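One way the post-action check could look, as a sketch only: poll the DOM until an expected state appears, and raise if it never does. This covers the "didn't wait for page load" case as well as the wrong-element case. The names wait_for_dom_state and SilentActionFailure are hypothetical, and read_dom stands in for whatever snapshot call the automation driver actually provides.

```python
import time
from typing import Callable, Dict

# Assumption: a DOM snapshot is modeled as a flat dict (e.g. url, title, field values).
Snapshot = Dict[str, str]


class SilentActionFailure(AssertionError):
    """The action returned normally, but the DOM never reached the expected state."""


def wait_for_dom_state(
    read_dom: Callable[[], Snapshot],
    predicate: Callable[[Snapshot], bool],
    timeout_s: float = 5.0,
    poll_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> Snapshot:
    """Poll read_dom until predicate holds; raise SilentActionFailure on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        snapshot = read_dom()
        if predicate(snapshot):
            return snapshot
        if time.monotonic() >= deadline:
            raise SilentActionFailure(f"expected DOM state not reached; last snapshot: {snapshot!r}")
        sleep(poll_s)
```

Calling this immediately after each click turns a silent no-op into an explicit failure that ordinary error handling can catch and log. The sleep parameter is injectable only so the loop can be exercised without real delays.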
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:01:46.371492+00:00— report_created — created