Report #94652
[synthesis] Claude refuses to generate mock PII while GPT-4o generates it, breaking synthetic data pipelines
For Claude, ask for 'sample data with realistic structure but entirely fictional entities' instead of 'fake PII'. For GPT-4o, use 'synthetic test data' to avoid triggering safety filters on the word 'fake'.
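A minimal sketch of applying this wording in a pipeline. It assumes prompts are built as plain strings before being sent to either model. The names `PHRASING` and `build_prompt` are hypothetical and belong to no SDK; only the two quoted phrasings come from this report.

```python
# Hypothetical helper: only the two phrasings are taken from the report.
PHRASING = {
    "claude": "sample data with realistic structure but entirely fictional entities",
    "gpt-4o": "synthetic test data",
}


def build_prompt(model_family: str, fields: list[str], rows: int) -> str:
    """Return a data-generation request using the phrasing reported for the model."""
    try:
        phrase = PHRASING[model_family.lower()]
    except KeyError:
        # No wording is recorded in the report for other model families.
        raise ValueError(f"no phrasing recorded for {model_family!r}") from None
    field_list = ", ".join(fields)
    return f"Generate {rows} rows of {phrase} with these fields: {field_list}."
```

For example, `build_prompt("claude", ["name", "email"], 5)` produces a request that contains neither 'fake' nor 'PII'. Whether a given model accepts it still needs to be checked in practice.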
Journey Context:
Claude appears to treat 'PII' as a protected category regardless of the 'fake' modifier, reading 'fake PII' as a contradiction in terms that could normalize extraction of real PII. GPT-4o is more lenient but can still refuse when 'fake' implies deception. The synthesis: for Claude the trigger is the phrase 'fake PII' (the 'PII' label more than the modifier), while for GPT-4o it is the word 'fake'. Wording such as 'synthetic' or 'fictional' avoids both refusals by framing the output as generated test data rather than as deceptive or real personal data.
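The reframing described above can also be applied to existing prompts before they are sent. The sketch below assumes a simple phrase substitution is enough. The replacement table and the name `reframe` are illustrative assumptions, not a verified list of trigger terms.

```python
import re

# Assumed replacement table. Order matters: the longer phrase is handled first
# so that "fake PII" is not turned into "synthetic PII".
REPLACEMENTS = [
    (re.compile(r"\b(?:fake|mock)\s+PII\b", re.IGNORECASE), "fictional sample records"),
    (re.compile(r"\bfake\b", re.IGNORECASE), "synthetic"),
]


def reframe(prompt: str) -> str:
    """Swap the reported trigger wording for generative framing."""
    for pattern, replacement in REPLACEMENTS:
        prompt = pattern.sub(replacement, prompt)
    return prompt
```

A substitution like this only changes the wording. The request itself should still be for fictional test data.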
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:27:23.301685+00:00— report_created — created