Report #52228
[gotcha] Assuming vision models only process visual content and ignoring invisible text layers
Strip image metadata (EXIF) and run OCR on images to inspect for hidden text (e.g., white text on white background) before passing the image to the LLM.
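A minimal sketch of the metadata-stripping step, assuming JPEG input and using only the Python standard library. The function name `strip_jpeg_metadata` and the choice of segments to drop (APP1 for EXIF/XMP, APP13 for IPTC/Photoshop data, COM for comments) are illustrative assumptions, not part of the original report. Other formats such as PNG or WebP store metadata differently and would need their own handling.

```python
import struct

# Marker bytes of segments that commonly carry free-form text (assumed list):
# 0xE1 = APP1 (EXIF, XMP), 0xED = APP13 (IPTC/Photoshop), 0xFE = COM (comment).
DROPPED_MARKERS = {0xE1, 0xED, 0xFE}


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return a copy of a JPEG byte string without the segments in DROPPED_MARKERS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy the rest verbatim.
            out += data[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2 or pos + 2 + length > len(data):
            raise ValueError("truncated or malformed segment")
        if marker not in DROPPED_MARKERS:
            out += data[pos:pos + 2 + length]
        pos += 2 + length
    raise ValueError("no start-of-scan marker found")
```

Dropping EXIF also removes the orientation tag, so some images may display rotated afterwards. Decoding and re-encoding the pixels with an image library is an alternative that discards all ancillary data at once.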
Journey Context:
Attackers can embed near-invisible text in an image, such as white text on a white background, or place instructions in its EXIF metadata. A vision model that processes the image may read the hidden text and follow the instructions. Because the payload never passes through text-based input filters, it bypasses them entirely, while a human reviewer sees a normal picture.
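To illustrate why white-on-white text is invisible to a person but not to a model, the sketch below flags pixels that differ from the dominant background shade by only a few gray levels. It assumes the image has already been decoded to a grayscale grid (a list of rows of 0-255 integers); the function names and the `max_delta` threshold are hypothetical, and this is a heuristic rather than a complete detector, since compression noise and anti-aliasing also produce near-background pixels.

```python
from collections import Counter


def near_background_mask(gray, max_delta=12):
    """Mark pixels that differ from the dominant shade by 1..max_delta gray levels."""
    counts = Counter(value for row in gray for value in row)
    if not counts:
        return []
    # Treat the most frequent shade as the background (an assumption that
    # holds for documents and screenshots, less so for photographs).
    background = counts.most_common(1)[0][0]
    return [
        [1 if 0 < abs(value - background) <= max_delta else 0 for value in row]
        for row in gray
    ]


def hidden_pixel_ratio(gray, max_delta=12):
    """Fraction of pixels that are almost, but not exactly, the background shade."""
    mask = near_background_mask(gray, max_delta)
    total = sum(len(row) for row in mask)
    return sum(map(sum, mask)) / total if total else 0.0
```

Rendering the mask as black-on-white and passing that to an OCR engine is one way to read what the low-contrast pixels spell out. An unusually high ratio alone can serve as a signal to hold the image for review.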
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:09:25.274786+00:00— report_created — created