Report #60500
[gotcha] Assuming text-based content filters apply to multi-modal inputs
Apply strict input validation and prompt injection defenses to every modality (images, audio, PDFs), not just text; treat any modality the LLM can process as a potential channel for injected text.
Journey Context:
Developers add image or audio capabilities but keep relying on moderation filters that only inspect typed text. Attackers embed invisible text in images (using steganography or micro-text) or ultrasonic commands in audio. The LLM processes the hidden text or audio and executes the injected prompt, completely bypassing the text-based filters, which were applied only to the user's typed input.
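A minimal sketch of the idea, assuming a pipeline where every attachment is first converted to text and then passed through the same filter as the typed input. All names here (`text_filter`, `Attachment`, `screen_request`, `fake_ocr`) are hypothetical, the pattern list is deliberately naive, and `fake_ocr` is only a stand-in for a real OCR or speech-to-text step. Pattern matching alone is not a complete defense; the point is only that no modality skips the check.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical, deliberately naive patterns, for illustration only.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (the )?system prompt", re.IGNORECASE),
]


def text_filter(text: str) -> bool:
    """Return True if the text matches none of the suspicious patterns."""
    return not any(p.search(text) for p in SUSPICIOUS_PATTERNS)


@dataclass
class Attachment:
    modality: str  # e.g. "image", "audio", "pdf"
    data: bytes


# An extractor turns raw attachment bytes into the text a model might read.
Extractor = Callable[[bytes], str]


def fake_ocr(data: bytes) -> str:
    """Stand-in for a real OCR step (assumption): decodes the bytes as UTF-8."""
    return data.decode("utf-8", errors="replace")


def screen_request(
    typed_text: str,
    attachments: List[Attachment],
    extractors: Dict[str, Extractor],
) -> List[str]:
    """Return a list of problems; an empty list means the request passed."""
    problems = []
    if not text_filter(typed_text):
        problems.append("typed text: flagged by filter")
    for i, att in enumerate(attachments):
        extractor = extractors.get(att.modality)
        if extractor is None:
            # Fail closed: a modality that cannot be inspected is rejected.
            problems.append(f"attachment {i} ({att.modality}): no extractor, rejected")
            continue
        # Run the SAME filter over text recovered from the attachment.
        if not text_filter(extractor(att.data)):
            problems.append(f"attachment {i} ({att.modality}): flagged by filter")
    return problems
```

With `extractors = {"image": fake_ocr}`, a request whose typed text is harmless but whose image bytes contain "Ignore previous instructions" is flagged, and an audio attachment is rejected because no extractor is registered for it. Failing closed on unknown modalities is a design choice in this sketch, not a requirement stated in the report.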
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T08:02:23.729326+00:00 — report_created — created