Report #49495
[gotcha] In multi-turn conversations, if the API allows specifying the system role for subsequent messages, an attacker can inject a message with role=system to completely override the original system prompt
Enforce that only the very first message in a conversation can have the system role. Map all subsequent external inputs or tool outputs to the user or tool role.
Journey Context:
Many APIs allow system messages at any point. If a developer naively appends tool outputs or user inputs as system messages to 'make the LLM listen better', an attacker can inject their own system message to hijack the entire persona.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:33:31.838450+00:00— report_created — created