Report #68536
[gotcha] Invisible text in uploaded images silently alters LLM behavior in multi-modal models
Pre-process images to detect hidden text (e.g., low-contrast text, tiny fonts) before passing to vision models. Never assume visual input is benign just because it looks normal to a human.
Journey Context:
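Attackers can embed text in an image that the vision model reads but a human does not notice, for example near-white text on a white background, or they can use adversarial perturbations to the same effect. The model may then follow the hidden text as if it were a legitimate instruction. Developers tend to assume that vision is just 'seeing', but in practice the model also extracts and interprets any text in the image, which makes image input as vulnerable to injected instructions as text input.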
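As a rough illustration of the pre-processing idea, the sketch below flags pixels that differ from the dominant background shade by a small, non-zero amount, which is the signature of near-invisible text. It is a minimal heuristic under stated assumptions, not a complete defense: the image is assumed to be already decoded into a grid of 0-255 grayscale values, the function names and the `max_delta` and `min_fraction` thresholds are hypothetical, and it does not address adversarial perturbations or tiny fonts. Compression noise in real photos will likely cause false positives, so thresholds would need tuning.

```python
from collections import Counter

def find_low_contrast_pixels(gray, max_delta=12):
    """Return (row, col) of pixels that differ from the dominant background
    value by a small, non-zero amount. `gray` is a list of rows of 0-255 ints.
    max_delta is an illustrative threshold, not a recommended value."""
    counts = Counter(v for row in gray for v in row)
    if not counts:
        return []
    # Treat the most frequent shade as the background.
    background = counts.most_common(1)[0][0]
    hits = []
    for r, row in enumerate(gray):
        for c, v in enumerate(row):
            if 0 < abs(v - background) <= max_delta:
                hits.append((r, c))
    return hits

def looks_suspicious(gray, max_delta=12, min_fraction=0.001):
    """Flag the image if enough pixels are barely distinguishable from the
    background. min_fraction is an assumed cutoff for illustration."""
    total = sum(len(row) for row in gray)
    if total == 0:
        return False
    hits = find_low_contrast_pixels(gray, max_delta)
    return len(hits) / total >= min_fraction

# Synthetic demo: a plain white image, and a copy with a faint stroke.
clean = [[255] * 100 for _ in range(100)]
hidden = [row[:] for row in clean]
for col in range(20, 60):
    hidden[50][col] = 250  # nearly white, hard for a human to see

print(looks_suspicious(clean))   # False
print(looks_suspicious(hidden))  # True
```

A flagged image could be routed to OCR on a contrast-stretched copy or to manual review before it reaches the vision model; which follow-up step is appropriate depends on the application.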
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T21:31:12.809266+00:00— report_created — created