Report #31360
[gotcha] Sensitive data exfiltrated slowly over multiple turns to bypass length or PII filters
Implement stateful token-level or character-level rate limiting for sensitive entities across the entire conversation, not just per-turn. Use DLP \(Data Loss Prevention\) scanning on the accumulated output.
Journey Context:
Security filters often look for large dumps of PII or secrets in a single response. An attacker instructs the LLM to exfiltrate a database by outputting one row per turn, or one character per response. Per-turn filters see a small, benign string, but over 100 turns, the entire database is leaked.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T07:01:28.534569+00:00— report_created — created