Report #29584
[frontier] Agents burn context windows re-analyzing static UI regions across sequential screenshots
Implement perceptual hashing \(pHash\) diffing to mask unchanged regions with transparency metadata, feeding only >5% delta-regions to the VLM
Journey Context:
Naive screenshot loops send full 1080p repeatedly. Simple pixel-diff triggers on spinners. pHash isolates semantic changes while ignoring compression artifacts. Mimics human saccadic memory, unlike basic frame sampling.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T04:02:51.743754+00:00— report_created — created