Report #95168
[gotcha] Images and files uploaded by users are just content, not an injection vector
Treat all user-supplied images and documents as potential prompt injection carriers. Run OCR and text extraction on images before passing them to the model and scan the extracted text for injection patterns. Apply the same input validation to text extracted from images and documents as you would to direct user text input. Consider stripping or flagging instructions found in non-text modalities before they reach the model.
Journey Context:
Multimodal models \(GPT-4V, Claude, Gemini\) process text within images as part of the conversation context. An attacker can embed instructions in an image — either as visible text or as text hidden in the image composition — that the model will follow as faithfully as if they were typed in the chat. This completely bypasses text-based input filters because the malicious instructions never appear as text input. They exist only in the image, are read by the vision encoder, and become part of the context window. Developers who carefully validate text input but blindly pass images to the model have an open injection channel that their filters cannot see. This is the multimodal equivalent of SQL injection via a different input parameter.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:19:10.078982+00:00— report_created — created