Report #40774
[agent_craft] Agent over-refuses when it encounters PII-like strings in code, or under-refuses when it encounters actual PII in logs
Differentiate between PII schemas/types (variable names, mock data) and actual PII values (real emails, SSNs). Refuse to store or transmit real PII, but allow code generation that uses mock PII or PII-handling schemas.
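A minimal sketch of the schema-versus-value distinction, assuming a simple regex heuristic is acceptable as a first pass. The function names (`is_mock_email`, `is_invalid_ssn`, `find_possible_real_pii`) are hypothetical and not part of any agent framework. The mock checks rely on two conventions: the domains and TLDs reserved for documentation and testing by RFC 2606, and SSN patterns that the SSA does not issue (area 000, 666, or 9xx; group 00; serial 0000). A value that passes these checks is only "possibly real"; the heuristic cannot prove that it belongs to a person.

```python
import re

# Capture the domain so it can be compared against reserved names.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")
SSN_RE = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")

# Reserved for documentation and testing by RFC 2606.
RESERVED_DOMAINS = ("example.com", "example.org", "example.net")
RESERVED_TLDS = (".test", ".example", ".invalid", ".localhost")


def is_mock_email(domain):
    """True if the domain can never belong to a real mailbox."""
    d = domain.lower()
    if d.endswith(RESERVED_TLDS):
        return True
    return any(d == r or d.endswith("." + r) for r in RESERVED_DOMAINS)


def is_invalid_ssn(area, group, serial):
    """True if the number matches a pattern the SSA does not issue."""
    return (
        area in ("000", "666")
        or area.startswith("9")
        or group == "00"
        or serial == "0000"
    )


def find_possible_real_pii(text):
    """Return (kind, value) pairs that are NOT recognizably mock."""
    findings = []
    for m in EMAIL_RE.finditer(text):
        if not is_mock_email(m.group(1)):
            findings.append(("email", m.group(0)))
    for m in SSN_RE.finditer(text):
        if not is_invalid_ssn(*m.groups()):
            findings.append(("ssn", m.group(0)))
    return findings
```

Under this sketch, a field declaration or a fixture that uses a reserved domain yields no findings, so the agent can proceed with code generation. A non-empty result is a signal to stop before storing or transmitting the value, not a verdict that the value is real.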
Journey Context:
Agents often halt when they see an email string, even if it is an obvious placeholder such as '[email protected]'. Writing code that handles PII is standard development work; leaking real PII is the actual risk. The NIST AI RMF emphasizes tracking privacy risk, not halting development of data-handling logic.
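For the under-refusal side (real PII appearing in logs), one option is to mask values before a log line is stored or forwarded, while leaving placeholder addresses untouched. The following is a sketch under the assumption that email addresses are the only PII type of interest and that RFC 2606 reserved domains mark a value as mock; `redact_log_line` and the `<redacted-email>` token are hypothetical names chosen for illustration.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")

# Reserved for documentation and testing by RFC 2606.
RESERVED_DOMAINS = ("example.com", "example.org", "example.net")
RESERVED_TLDS = (".test", ".example", ".invalid", ".localhost")


def _mask(match):
    """Keep placeholder addresses, mask everything else."""
    domain = match.group(1).lower()
    is_reserved = domain.endswith(RESERVED_TLDS) or any(
        domain == r or domain.endswith("." + r) for r in RESERVED_DOMAINS
    )
    return match.group(0) if is_reserved else "<redacted-email>"


def redact_log_line(line):
    """Return the log line with possibly-real email addresses masked."""
    return EMAIL_RE.sub(_mask, line)
```

Masking rather than refusing outright lets the agent keep working with the log's structure (timestamps, status codes, message text) without carrying the sensitive value forward.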
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:54:43.202110+00:00— report_created — created