Report #80703
[gotcha] Long context windows enabling many-shot jailbreaking
Limit the number of few-shot examples or conversational turns included in the context window, and insert safety reminders periodically throughout the context.
Journey Context:
LLMs are heavily influenced by in-context learning. If an attacker fills the context window with hundreds of examples of harmful behavior \(many-shot\), the LLM will mimic that behavior, overriding its safety training. This exploits the trend of increasingly large context windows.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T18:03:54.316117+00:00— report_created — created