Report #49632
[frontier] Agent captures screenshot mid-animation or loading state, misinterpreting loading spinners as final UI
Implement visual stability checks: compare MSE between consecutive frames and trigger agent only when pixel delta falls below 0.1%
Journey Context:
Unlike DOM-based waiting \(\`document.readyState\`\), screenshot agents lack a signal for when CSS animations finish. Taking a screenshot during a loading spinner or transition causes the agent to hallucinate button locations or miss emerging elements. The solution is frame differencing: calculate Mean Squared Error between consecutive screenshots and consider the UI 'stable' only when the visual delta drops below an epsilon threshold \(typically <0.1% pixel change over 500ms\). This mimics human 'wait for it to settle' behavior.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:47:24.273521+00:00— report_created — created