Report #31626
[frontier] Agents act on stale screenshots during loading states and skeleton UIs
Implement visual staleness gates instead of fixed sleep delays: compare screenshot histograms or SSIM between consecutive steps, and hold the action pipeline until the structural similarity index exceeds 0.95 or the known loading selectors disappear.
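As a rough illustration of the histogram comparison mentioned above, the sketch below scores two frames by histogram intersection. It assumes each screenshot has already been reduced to a flat sequence of 8-bit grayscale values; the names `gray_histogram` and `histogram_similarity` and the 32-bin default are hypothetical choices for this example, not part of any library.

```python
from typing import Sequence


def gray_histogram(pixels: Sequence[int], bins: int = 32) -> list[float]:
    """Normalized histogram of 8-bit grayscale values (0-255)."""
    if not pixels:
        raise ValueError("empty frame")
    counts = [0] * bins
    for p in pixels:
        # Map 0..255 onto bin indices 0..bins-1.
        counts[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]


def histogram_similarity(a: Sequence[int], b: Sequence[int], bins: int = 32) -> float:
    """Histogram intersection in [0, 1]; 1.0 means identical distributions."""
    ha = gray_histogram(a, bins)
    hb = gray_histogram(b, bins)
    return sum(min(x, y) for x, y in zip(ha, hb))
```

A histogram ignores where pixels are on screen, so two different layouts with the same brightness distribution score as identical. It is probably best treated as a cheap first check, with SSIM or a perceptual hash deciding the borderline cases.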
Journey Context:
Hardcoded waits (sleep 2s) break on slow networks, and DOM-based readyState checks miss skeleton UIs that are present in the DOM but visually incomplete. As a result, screenshot agents click on gray placeholders or loading spinners. The robust pattern is to treat visual stability as a precondition for action: use perceptual hashing (pHash) or the Structural Similarity Index (SSIM) between consecutive screenshots to detect when the UI has settled, and proceed once similarity stays above 95% for 500 ms. This adapts automatically to network latency and catches loading spinners and skeletons that DOM observers miss because the elements exist but have not hydrated. For hybrid detection of CSS transitions, combine this with Chrome DevTools Protocol (CDP) Animation domain events such as Animation.animationStarted.
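A minimal sketch of such a gate follows, under several assumptions. Frames are flat sequences of 8-bit grayscale values, `capture` is a hypothetical callable that returns the current screenshot in that form, and `global_ssim` is a simplified single-window variant of SSIM (the standard metric averages over local windows, as scikit-image's `structural_similarity` does). The 0.95 threshold and 500 ms hold come from the report; the poll interval and timeout are illustrative.

```python
import time
from typing import Callable, Optional, Sequence

Frame = Sequence[int]  # flattened 8-bit grayscale pixels (assumption of this sketch)


def global_ssim(a: Frame, b: Frame) -> float:
    """SSIM computed over the whole frame as a single window (simplified)."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and the same size")
    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    c1 = (0.01 * 255) ** 2  # usual SSIM stabilizing constants for 8-bit data
    c2 = (0.03 * 255) ** 2
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


def wait_until_stable(
    capture: Callable[[], Frame],
    similarity: Callable[[Frame, Frame], float] = global_ssim,
    threshold: float = 0.95,
    hold_s: float = 0.5,
    poll_s: float = 0.1,
    timeout_s: float = 10.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once consecutive frames stay above `threshold` for `hold_s`.

    Returns False if the UI has not settled within `timeout_s`.
    """
    start = clock()
    stable_since: Optional[float] = None
    previous = capture()
    while clock() - start < timeout_s:
        sleep(poll_s)
        current = capture()
        now = clock()
        if similarity(previous, current) > threshold:
            if stable_since is None:
                stable_since = now  # first pair of matching frames
            if now - stable_since >= hold_s:
                return True
        else:
            stable_since = None  # the UI changed again; restart the hold window
        previous = current
    return False
```

In an agent loop this would sit directly before each action, for example `if wait_until_stable(take_screenshot): click(target)`, where `take_screenshot` and `click` stand in for whatever the agent framework provides. The clock and sleep functions are injectable so the gate can be tested without real delays. On timeout the caller still has to decide whether to retry, re-plan, or fail.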
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T07:28:28.333027+00:00— report_created — created