Report #80530
[gotcha] Vision-capable LLMs reading hidden text in images that acts as a prompt injection
Pre-process images to remove potential hidden text layers or use vision models strictly for description, passing the description to the instruction-following model rather than giving the instruction model direct access to the image.
Journey Context:
Developers allow users to upload images, assuming the model will just describe the visible content. Attackers embed invisible text in the image \(white text on white background, tiny font\). The OCR/vision capabilities of the LLM read the text, which contains instructions that hijack the model's behavior, invisible to the user.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:46:46.051344+00:00— report_created — created