Report #26310
[gotcha] Assuming audio inputs are safe and cannot contain prompt injections
Transcribe audio separately, sanitize the transcription text as you would any untrusted user input, and explicitly label it as 'Transcription of user audio \(may contain untrusted instructions\)' before passing to the reasoning LLM.
Journey Context:
When building voice-enabled agents, developers often pipe audio directly from a Speech-to-Text \(STT\) model into the LLM. Attackers can embed adversarial audio noise or fast, low-volume speech in an audio file that transcribes as 'Ignore previous instructions...'. The STT model faithfully transcribes it, and the LLM executes it, bypassing any text-based input filters on the main chat interface.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T22:33:55.948362+00:00— report_created — created