Report #77316
[synthesis] Defensive cybersecurity tool calls (e.g., log parsing, vulnerability scanning) trigger disproportionate refusals in GPT-4o compared to Claude
Establish the defensive context for security tool calls explicitly in the system prompt ('You are a defensive SOC analyst...') and avoid generic terms like 'exploit' or 'payload' in parameter names; use 'detection signature' or 'test vector' instead.
Journey Context:
GPT-4o refuses at a much lower threshold when cybersecurity terms appear, often declining to write a regex for detecting Log4j or to generate a network scan command even in a tool context. Claude 3.5 Sonnet evaluates the broader context and is more likely to execute defensive security tool calls if the system prompt establishes a defensive posture. Gemini falls in between but is highly sensitive to network-related tool parameters. Reframing the parameter names in the tool schema can keep the model's internal safety filter from triggering before the tool is even called.
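The reframing described above can be sketched as a small preprocessing step on the tool schema. This is a minimal illustration, not a verified fix: the tool name `scan_logs`, its parameters, the `NEUTRAL_TERMS` mapping, and the helpers `neutralize` and `reframe_schema` are all hypothetical, and the schema layout only assumes the JSON-Schema-style `parameters` object commonly used for function calling. Whether the renamed schema actually reduces refusals has to be checked against each model.

```python
import json

# Hypothetical mapping from the terms the report flags to the neutral
# alternatives it suggests. Extend it for your own schemas.
NEUTRAL_TERMS = {
    "exploit": "detection_signature",
    "payload": "test_vector",
}

# System prompt that states the defensive posture up front (wording is illustrative).
SYSTEM_PROMPT = (
    "You are a defensive SOC analyst. You parse logs and run authorized "
    "vulnerability scans on systems your organization owns."
)


def neutralize(name: str) -> str:
    """Replace flagged substrings in a parameter name with neutral wording."""
    result = name.lower()
    for flagged, neutral in NEUTRAL_TERMS.items():
        result = result.replace(flagged, neutral)
    return result


def reframe_schema(tool: dict) -> dict:
    """Return a copy of a tool definition with its parameter names reframed."""
    params = tool["parameters"]
    renamed = {neutralize(k): v for k, v in params["properties"].items()}
    required = [neutralize(k) for k in params.get("required", [])]
    return {
        **tool,
        "parameters": {**params, "properties": renamed, "required": required},
    }


# Hypothetical tool definition that uses the flagged terms.
log_scan_tool = {
    "name": "scan_logs",
    "description": "Search application logs for indicators of known vulnerabilities.",
    "parameters": {
        "type": "object",
        "properties": {
            "log_path": {"type": "string"},
            "exploit_pattern": {"type": "string"},
            "payload": {"type": "string"},
        },
        "required": ["log_path", "exploit_pattern"],
    },
}

reframed = reframe_schema(log_scan_tool)
print(json.dumps(reframed, indent=2))
```

The tool implementation behind the schema must accept the renamed parameters too, so in practice the same mapping would be applied where the tool call is dispatched.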
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:22:20.651378+00:00— report_created — created