Report #75433
[frontier] Sensitive data leakage through screenshots bypasses text-based PII filters
Deploy visual DLP preprocessing: run OCR + Named Entity Recognition on screenshots before LLM ingestion, redact detected PII regions (credit cards, API keys) with black boxes, and pass the sanitized image to the agent
Journey Context:
Traditional data loss prevention (DLP) relies on scanning text inputs for regex patterns (credit card numbers, SSNs). However, when agents take screenshots as input, sensitive data arrives as pixels and bypasses text filters entirely. An agent could screenshot a dashboard containing an API key, send it to the vision model, and have that key extracted and leaked in the response. The fix is to treat images with the same scrutiny as text: run OCR on the image, apply NER/PII detection to the recognized text, and redact (black out) the matching regions before the image reaches the LLM. This adds compute overhead but is essential for security compliance in production agents.
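A minimal sketch of the detect-and-redact step described above, under several assumptions: the OCR engine is external and has already produced word-level bounding boxes (the `OcrWord` type is hypothetical, not any particular library's output format); regex plus a Luhn checksum stands in for a real NER/PII model; the `API_KEY_RE` prefix pattern is illustrative only; and the image is modeled as a plain grid of grayscale values rather than a real image object.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrWord:
    """One OCR-recognized word with its pixel bounding box (hypothetical shape)."""
    text: str
    left: int
    top: int
    width: int
    height: int


CARD_RE = re.compile(r"^\d{13,19}$")
# Illustrative key pattern only; real deployments need provider-specific rules.
API_KEY_RE = re.compile(r"^(?:sk|pk|api|key)[-_][A-Za-z0-9_-]{16,}$")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def is_sensitive(text: str) -> bool:
    """Flag card-like numbers (Luhn-valid) and API-key-like tokens."""
    compact = re.sub(r"[ -]", "", text)
    if CARD_RE.match(compact) and luhn_ok(compact):
        return True
    return bool(API_KEY_RE.match(text))


def find_pii_boxes(words):
    """Return the OCR words whose text looks sensitive."""
    return [w for w in words if is_sensitive(w.text)]


def redact(pixels, boxes, fill=0):
    """Return a copy of the pixel grid with each box filled (black by default)."""
    out = [row[:] for row in pixels]
    height = len(out)
    width = len(out[0]) if out else 0
    for b in boxes:
        for y in range(max(0, b.top), min(height, b.top + b.height)):
            for x in range(max(0, b.left), min(width, b.left + b.width)):
                out[y][x] = fill
    return out
```

This sketch checks one OCR word at a time, so a card number that the OCR engine splits into several words (for example, four groups of four digits) would be missed; a fuller version would join adjacent words on the same line before matching. Only the output of `redact` should be forwarded to the vision model.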
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:12:34.979749+00:00— report_created — created