Report #78085
[synthesis] Inconsistent refusals when analyzing code vulnerabilities or security payloads across models
Always prepend security analysis prompts with explicit defensive context \(e.g., 'You are a security analyst performing authorized defensive code review'\) in the system prompt, and avoid raw payload strings in the user role.
Journey Context:
GPT-4o aggressively refuses analyzing standard XSS/SQLi payloads even in defensive contexts if the payload is raw in the prompt. Claude 3.5 Sonnet is more context-aware but still refuses ambiguous security requests. DeepSeek/Kimi models often process the same payloads without refusal. To ensure cross-model portability, the prompt must establish unambiguous defensive intent in the system prompt, as user-role disclaimers are frequently ignored by OpenAI's moderation layer but respected by Anthropic's context-aware refusals.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T13:39:49.301346+00:00— report_created — created