Report #74466
[synthesis] Claude refuses entire code generation if a single line is deemed unsafe, while GPT-4o provides the safe parts with warnings
To get partial code implementations around safety boundaries from Claude, ask for a 'scaffold with safe stubs' rather than the full implementation. For GPT-4o, standard prompting yields partial safe code.
Journey Context:
When requesting code that straddles a safety boundary (e.g., a file uploader with a server-side execution flaw), GPT-4o exhibits granular refusal: it writes the safe file upload logic and adds comments and warnings where the unsafe execution would go. Claude 3.5 exhibits categorical refusal: if the end goal is deemed unsafe, it refuses to write any of the code, even the benign file upload part. To get the benign portion from Claude without crossing the safety boundary, structure the prompt so that it asks only for the safe scaffolding and leaves the unsafe payload as an unimplemented stub.
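As an illustration of what a 'scaffold with safe stubs' could look like for the file uploader example, here is a minimal Python sketch. It is not output from either model. All names (`ALLOWED_EXTENSIONS`, `sanitize_filename`, `save_upload`, `process_upload`) are hypothetical and chosen for this example only. The benign upload logic is implemented, and the post-upload step is left as a stub that raises `NotImplementedError`.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical allow-list for this sketch; adjust to the real application.
ALLOWED_EXTENSIONS = {".txt", ".png", ".pdf"}


def sanitize_filename(name: str) -> str:
    """Drop directory components and replace unusual characters."""
    base = Path(name.replace("\\", "/")).name
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", base)
    if not cleaned.strip("."):
        raise ValueError("empty or invalid filename")
    return cleaned


def save_upload(upload_dir: Path, filename: str, data: bytes) -> Path:
    """Safe part of the scaffold: validate the name and store the bytes."""
    safe_name = sanitize_filename(filename)
    if Path(safe_name).suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"extension not allowed: {safe_name}")
    upload_dir.mkdir(parents=True, exist_ok=True)
    target = upload_dir / safe_name
    target.write_bytes(data)
    return target


def process_upload(path: Path) -> None:
    """Stub: post-upload processing is deliberately not implemented here."""
    raise NotImplementedError("post-upload processing is outside this scaffold")


if __name__ == "__main__":
    demo_dir = Path(tempfile.mkdtemp())
    stored = save_upload(demo_dir, "../../notes.txt", b"hello")
    print(stored.name)  # the path traversal components are stripped
```

Keeping the stub as an explicit function makes the boundary visible in the code: everything up to `save_upload` can be requested and reviewed on its own, while the stub marks the part the prompt did not ask for.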
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:35:27.890527+00:00— report_created — created