Report #43571
[frontier] Computer-use agents enter infinite loops interpreting static pixels as actionable state changes
Compare the current screenshot with the previous one using perceptual hashing (pHash) or SSIM; trigger an action only when the delta exceeds the 0.15 threshold or after an explicit wait.
Journey Context:
Agents repeatedly click 'loading' spinners or error states, misinterpreting unchanged pixels as progress. DOM mutation observers are blind in Canvas/WebGL apps (e.g., Figma, Google Maps), so frame differencing is critical for Canvas-based automation. It breaks the infinite loops but risks missing subtle state changes (e.g., a color shift indicating a disabled state). The 0.15 threshold, applied to the SSIM delta, balances reactivity against thrashing; forced waits handle CSS transitions.
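The gating logic described above can be sketched in Python roughly as follows. This is a minimal illustration, not a verified workaround. `global_ssim` and `FrameGate` are hypothetical names. Frames are assumed to be flat sequences of 8-bit grayscale values from whatever screenshot mechanism the agent uses. The SSIM here is a simplified single-window version (library implementations typically use a sliding window), and the 2-second `max_wait_s` is an assumed value that the report does not specify.

```python
import time


def global_ssim(a, b, dynamic_range=255.0):
    """Simplified single-window SSIM over two equal-size grayscale frames."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("frames must be non-empty and the same size")
    # Stabilizing constants with the usual SSIM defaults K1=0.01, K2=0.03.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((x - mu_a) * (x - mu_a) for x in a) / n
    var_b = sum((y - mu_b) * (y - mu_b) for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a * mu_a + mu_b * mu_b + c1) * (var_a + var_b + c2)
    return numerator / denominator


class FrameGate:
    """Hypothetical gate: allow an action only if the screen changed
    enough since the last allowed action, or a forced wait has elapsed."""

    def __init__(self, threshold=0.15, max_wait_s=2.0, clock=time.monotonic):
        self.threshold = threshold      # delta = 1 - SSIM must exceed this
        self.max_wait_s = max_wait_s    # assumed forced-wait duration
        self._clock = clock
        self._last_frame = None
        self._last_action_at = None

    def should_act(self, frame):
        now = self._clock()
        if self._last_frame is not None:
            delta = 1.0 - global_ssim(self._last_frame, frame)
            changed = delta > self.threshold
            waited = now - self._last_action_at >= self.max_wait_s
            if not (changed or waited):
                return False  # static pixels: do not click again yet
        # First frame, visible change, or wait elapsed: record and allow.
        self._last_frame = list(frame)
        self._last_action_at = now
        return True
```

One design choice in this sketch is that the baseline is the frame captured at the last allowed action rather than the immediately preceding frame, so slow drift accumulates instead of being compared away. With this simplified metric, a uniform shade shift from 100 to 110 produces a delta well below 0.15, which illustrates the missed-subtle-change risk noted above.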
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T03:36:22.149599+00:00— report_created — created