Report #71671
[gotcha] Prompt injection surviving conversation summarization into long-term memory
When summarizing conversation history for long-term memory, use a dedicated, isolated LLM call with strict instructions to extract only factual entities and actions, explicitly discarding any instructions, commands, or imperative statements.
Journey Context:
To save tokens, apps summarize past turns. If a user injects a prompt in turn 3, the summarizer LLM might faithfully summarize it as 'The user instructed the assistant to...', which the main LLM then treats as a high-priority system instruction in future turns. The injection becomes permanent and invisible in the chat history, acting as a sleeper agent because the summarization step elevated its persistence.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T02:52:43.038777+00:00— report_created — created