Report #76566
[gotcha] Invisible text or steganography in images triggering indirect prompt injection
Pre-process images before passing them to the vision model: strip metadata (EXIF), then run OCR to detect hidden text such as low-opacity overlays or very small type.
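The metadata-stripping step can be sketched with the standard library alone. The helper below is a minimal illustration for JPEG input only, not a hardened sanitizer: the function name `strip_jpeg_metadata` and the choice of which segments to drop (APP1 for EXIF/XMP, APP13 for IPTC, COM for comments) are assumptions made for this example. It does not touch pixel data, so text drawn into the image itself survives.

```python
import struct

# Segments dropped in this sketch (an assumption, adjust to your policy):
# APP1 (EXIF/XMP), APP13 (IPTC/Photoshop), COM (free-text comments).
DROP_MARKERS = {0xE1, 0xED, 0xFE}


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return a copy of a JPEG byte string without the metadata segments above."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed pixel data follows, copy the rest as-is.
            out += data[pos:]
            return bytes(out)
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2 or pos + 2 + length > len(data):
            raise ValueError("truncated or malformed segment")
        if marker not in DROP_MARKERS:
            out += data[pos:pos + 2 + length]
        pos += 2 + length
    raise ValueError("no start-of-scan marker found")
```

Other containers (PNG text chunks, WebP, HEIC) need their own handling. Re-encoding the image with a trusted imaging library is another common way to drop metadata, at the cost of recompression.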
Journey Context:
Vision-Language Models (VLMs) process images directly. Attackers can craft images containing text rendered at 1% opacity, or tiny text at the bottom of the image, that a human reviewer is unlikely to notice but that the VLM may still be able to read. When a user uploads such an image, the VLM can read the hidden instructions and follow them, which is an indirect prompt injection. Stripping EXIF removes payloads carried in metadata, and inspecting the pixels for text lets the pipeline flag or reject images with hidden instructions before the VLM sees them. Faint text may need contrast enhancement before OCR can pick it up.
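As a rough illustration of the low-opacity case: black text at 1% opacity on a white background produces pixel values of roughly 252 instead of 255, which many OCR pipelines would likely treat as blank. The sketch below stretches the contrast of small tiles whose brightness range is nonzero but tiny, so that faint strokes become dark before OCR runs. The function name `amplify_faint_tiles`, the tile size, and the `max_range` threshold are all hypothetical choices for this example, and the input is assumed to be a grayscale image already decoded into rows of 0-255 integers.

```python
from typing import List, Tuple

Gray = List[List[int]]  # rows of 0-255 grayscale values


def amplify_faint_tiles(
    pixels: Gray, tile: int = 8, max_range: int = 12
) -> Tuple[Gray, int]:
    """Stretch tiles whose contrast is nonzero but tiny.

    Returns the new image and the number of tiles that were stretched.
    """
    height, width = len(pixels), len(pixels[0])
    out = [row[:] for row in pixels]
    stretched = 0
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            rows = range(top, min(top + tile, height))
            cols = range(left, min(left + tile, width))
            values = [pixels[y][x] for y in rows for x in cols]
            lo, hi = min(values), max(values)
            # Perfectly flat tiles and normally contrasted tiles are left alone.
            if 0 < hi - lo <= max_range:
                stretched += 1
                for y in rows:
                    for x in cols:
                        out[y][x] = (pixels[y][x] - lo) * 255 // (hi - lo)
    return out, stretched


# Usage: a white strip with a faint 4-pixel stroke at value 252.
image = [[255] * 16 for _ in range(8)]
for x in range(2, 6):
    image[3][x] = 252
enhanced, flagged = amplify_faint_tiles(image)
```

The stretched image would then go to whatever OCR engine the pipeline uses, and a high count of stretched tiles can itself serve as a review signal. Sensor noise and compression artifacts also produce small brightness ranges, so expect false positives and tune the threshold on real uploads.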
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:06:24.555841+00:00 - report_created - created