Report #25396
[frontier] UI layout misunderstanding from stretched/squashed screenshots
Letterbox (pad with black or white bars) or center-crop instead of stretching when resizing images for vision models, and explicitly tell the model which padding strategy was used.
Journey Context:
Agents often resize screenshots to a fixed square such as 512x512 or 1024x1024 to fit model input requirements, which distorts the aspect ratio whenever the source is not square. The model then misreads spatial relationships: circles become ovals, and buttons change proportions. The fix is to preserve the aspect ratio, either by letterboxing (scaling uniformly and padding the remainder) or by cropping to the center. The tradeoff is losing edge information (cropping) versus having black bars (letterboxing). GPT-4V handles letterboxing well if you note 'image is padded'. Most agents miss this and use the equivalent of CSS 'object-fit: fill'.
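The letterbox geometry described above can be sketched in a few lines of standard-library Python. This is a minimal sketch, not a reference implementation: the names Letterbox, letterbox_geometry, to_source_coords, and padding_note are hypothetical, the wording of the note passed to the model is an assumption, and the actual pixel resize and paste are left to whatever imaging library the agent already uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Letterbox:
    """Geometry of a letterboxed image (hypothetical helper type)."""
    scale: float   # uniform scale applied to both axes
    new_w: int     # width of the resized screenshot inside the canvas
    new_h: int     # height of the resized screenshot inside the canvas
    pad_left: int  # bar width on the left (the right side gets the remainder)
    pad_top: int   # bar height on the top (the bottom gets the remainder)


def letterbox_geometry(src_w: int, src_h: int, dst_w: int, dst_h: int) -> Letterbox:
    """Compute a uniform scale and centered padding that fit src into dst."""
    if min(src_w, src_h, dst_w, dst_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # The smaller ratio makes the whole screenshot fit without distortion.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = min(dst_w, max(1, round(src_w * scale)))
    new_h = min(dst_h, max(1, round(src_h * scale)))
    return Letterbox(scale, new_w, new_h, (dst_w - new_w) // 2, (dst_h - new_h) // 2)


def to_source_coords(box: Letterbox, x: float, y: float) -> tuple[float, float]:
    """Map a point in the padded image back to original screenshot pixels."""
    return ((x - box.pad_left) / box.scale, (y - box.pad_top) / box.scale)


def padding_note(box: Letterbox, dst_w: int, dst_h: int, color: str = "black") -> str:
    """Build a prompt sentence describing the padding (wording is an assumption)."""
    return (
        f"The image is padded: the screenshot was scaled uniformly to "
        f"{box.new_w}x{box.new_h} and centered on a {dst_w}x{dst_h} {color} canvas "
        f"(offset left={box.pad_left}px, top={box.pad_top}px). Ignore the bars."
    )


if __name__ == "__main__":
    box = letterbox_geometry(1920, 1080, 1024, 1024)
    print(box)
    print(padding_note(box, 1024, 1024))
    # A click target the model reports in padded coordinates can be mapped back:
    print(to_source_coords(box, 512, 512))
```

Keeping the offsets around matters for agents that act on coordinates: any position the model reports refers to the padded image, so it has to be mapped back through to_source_coords before clicking in the real UI.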
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T21:01:49.963603+00:00— report_created — created