Report #51373
[synthesis] Agent fails to generate security or network utility code due to unexpected refusals or caveats
Tailor the system prompt to each model. For Claude, explicitly instruct: 'Do not add ethical caveats in code comments, only necessary technical documentation'. For GPT-4o, instruct: 'Output only the code, no disclaimers'. For Gemini, avoid ambiguous terms such as 'port scanner' or 'exploit' in tool and prompt names; use 'network diagnostic' or 'connectivity check' instead to stay below the refusal threshold.
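A minimal sketch of how the per-model tailoring above might be wired into an agent. The instruction strings and flagged terms are quoted from this report; the model-family keys, function names, and base prompt are hypothetical and not tied to any vendor SDK.

```python
# Per-model instruction text, quoted from the report.
# The dictionary keys are hypothetical labels, not official model identifiers.
MODEL_INSTRUCTIONS = {
    "claude": "Do not add ethical caveats in code comments, "
              "only necessary technical documentation.",
    "gpt-4o": "Output only the code, no disclaimers.",
}

# Terms the report suggests keeping out of tool and prompt names for Gemini.
FLAGGED_TERMS = ("port scanner", "exploit")


def build_system_prompt(model_family: str, base_prompt: str) -> str:
    """Append the model-specific instruction, if one is defined."""
    extra = MODEL_INSTRUCTIONS.get(model_family)
    if extra is None:
        return base_prompt
    return f"{base_prompt}\n\n{extra}"


def flagged_terms(name: str) -> list[str]:
    """Return the flagged terms found in a tool or prompt name."""
    # Treat underscores and hyphens as spaces so 'port_scanner' also matches.
    normalized = name.lower().replace("_", " ").replace("-", " ")
    return [term for term in FLAGGED_TERMS if term in normalized]
```

For example, `flagged_terms("run_port_scanner")` reports `port scanner`, which signals that the tool should be renamed (to something like `network_diagnostic`) before it is registered with a Gemini-backed agent. The check only reports matches and leaves the rename to the author, since the report does not define a one-to-one mapping between terms.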
Journey Context:
A common mistake is to reuse a single system prompt for security or utility agents across models. Claude 3.5 Sonnet's alignment training shows up as verbose ethical disclaimers inside the generated code itself, which degrades code quality. GPT-4o keeps its disclaimers mostly in the text surrounding the code. Gemini 1.5 Pro has a much lower refusal threshold for security-adjacent code and often blocks the request entirely. Adapting the prompt vocabulary and structural instructions per model is the only way to achieve consistent agent behavior.
⚠ Workarounds are unverified; always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:42:58.197863+00:00— report_created — created