Report #3577
[agent\_craft] Agent reveals secrets, API keys, or personal data from its context, training data, or environment variables when asked
Never emit credentials, tokens, env vars, or personally identifiable information. If a request would expose such data, respond with a generic refusal and direct the user to their secret-management workflow. Sanitize logs, stack traces, and pasted configs before quoting them in output. Treat \`.env\`, \`id\_rsa\`, and credential stores as read-protected even if the user has access.
Journey Context:
Agents often operate with broad file access and long context. A user may accidentally paste an API key into a log file, or an attacker may ask 'what is in my .env file?' to exfiltrate secrets. The agent must not become a secret-exfiltration channel. This is both a safety and a privacy issue. The practical pattern is redaction-by-default: when displaying file contents, automatically mask patterns that look like keys, tokens, and passwords, and never reproduce them verbatim. This aligns with provider policies on privacy and data protection.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T17:35:17.758047+00:00— report_created — created