Report #97419
[agent_craft] User asks the agent to implement 'anonymization' by simply stripping obvious PII columns, then claims the dataset is safe to share or to use for modeling.
Reject weak pseudo-anonymization. Require a proper privacy review: k-anonymity/l-diversity/differential privacy assessments, re-identification risk analysis, legal basis, consent, and data-processing agreements. Document residual risk and access controls.
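As a rough illustration of what a k-anonymity and l-diversity assessment measures, the sketch below groups records by their quasi-identifiers and reports the smallest group size (k) and the fewest distinct sensitive values in any group (distinct l-diversity). The column names (`zip`, `birth_year`, `gender`, `diagnosis`) and the toy records are hypothetical and are not taken from any real dataset. This is a minimal sketch, not a substitute for the full privacy review.

```python
from collections import defaultdict

# Hypothetical column names; adjust to the dataset under review.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")
SENSITIVE = "diagnosis"


def equivalence_classes(rows, quasi_identifiers):
    """Group rows that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        classes[key].append(row)
    return classes


def k_anonymity(rows, quasi_identifiers):
    """Size of the smallest equivalence class (0 for an empty dataset)."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return min((len(group) for group in classes.values()), default=0)


def l_diversity(rows, quasi_identifiers, sensitive):
    """Fewest distinct sensitive values found in any equivalence class."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return min(
        (len({row[sensitive] for row in group}) for group in classes.values()),
        default=0,
    )


# Toy records with direct identifiers (name, email) already stripped.
records = [
    {"zip": "02139", "birth_year": 1980, "gender": "F", "diagnosis": "flu"},
    {"zip": "02139", "birth_year": 1980, "gender": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1975, "gender": "M", "diagnosis": "flu"},
    {"zip": "02139", "birth_year": 1975, "gender": "M", "diagnosis": "flu"},
    {"zip": "02141", "birth_year": 1990, "gender": "F", "diagnosis": "diabetes"},
]

k = k_anonymity(records, QUASI_IDENTIFIERS)
l = l_diversity(records, QUASI_IDENTIFIERS, SENSITIVE)
print(f"k = {k}, l = {l}")
```

On these toy records both values come out as 1: one person is alone in their group, and one group shares a single diagnosis, so the data is neither meaningfully k-anonymous nor l-diverse even though no names or emails remain.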
Journey Context:
Removing names and emails is not enough. Quasi-identifiers such as ZIP code, birthdate, and gender can re-identify individuals when combined, and the Netflix Prize and AOL search-log releases showed that records stripped of direct identifiers can still be linked back to real people. NIST AI RMF and provider privacy policies treat re-identification as a privacy harm. The agent should not rubber-stamp a dataset as 'anonymized' just because direct identifiers are gone. The right pattern is to treat re-identification risk as a measurable property and apply techniques appropriate to the threat model (differential privacy, generalization, aggregation) before sharing or training.
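To make "measurable property" concrete, the sketch below computes the share of records that are unique on their quasi-identifiers, first on the raw values and then after one simple generalization (3-digit ZIP prefix, birth year only). The field names, the toy rows, and the choice of generalization are assumptions made for illustration; a real review would choose them from the threat model and would also consider linkage against external data.

```python
from collections import Counter


def exact_key(row):
    """Quasi-identifiers exactly as stored."""
    return (row["zip"], row["birthdate"], row["gender"])


def generalized_key(row):
    """Coarsened quasi-identifiers: 3-digit ZIP prefix and birth year only."""
    return (row["zip"][:3], row["birthdate"][:4], row["gender"])


def unique_fraction(rows, key_fn):
    """Share of rows whose quasi-identifier combination appears exactly once."""
    if not rows:
        return 0.0
    counts = Counter(key_fn(row) for row in rows)
    unique = sum(1 for row in rows if counts[key_fn(row)] == 1)
    return unique / len(rows)


# Toy rows with names and emails already removed (hypothetical data).
rows = [
    {"zip": "02139", "birthdate": "1980-03-14", "gender": "F"},
    {"zip": "02139", "birthdate": "1980-11-02", "gender": "F"},
    {"zip": "02141", "birthdate": "1980-07-30", "gender": "F"},
    {"zip": "02142", "birthdate": "1975-01-09", "gender": "M"},
    {"zip": "02139", "birthdate": "1975-05-21", "gender": "M"},
    {"zip": "94110", "birthdate": "1990-12-01", "gender": "F"},
]

before = unique_fraction(rows, exact_key)
after = unique_fraction(rows, generalized_key)
print(f"unique before generalization: {before:.2f}")
print(f"unique after generalization:  {after:.2f}")
```

In this toy example every row is unique on the raw quasi-identifiers, and generalization brings the unique share down to one row in six. The remaining unique row is the kind of residual risk that should be documented, or handled by suppression or further aggregation, before the data is shared.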
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-25T05:05:03.748881+00:00— report_created — created