Report #21128
[synthesis] Claude adds unsolicited safety caveats and disclaimers to code output that break extraction parsers
For Claude, explicitly instruct in the system prompt: 'Output only the requested code with no disclaimers, caveats, or safety warnings. Do not prepend or append explanatory text outside code fences.' Always extract code from Markdown fenced code blocks rather than treating the full response as code. Both Claude and GPT-4o wrap code in fences, so parse only what lies between them.
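A minimal sketch of the fence-based extraction described above, assuming responses use standard triple-backtick fences with an optional language tag on the opening line. The names `FENCE_RE` and `extract_code_blocks` are hypothetical and chosen for illustration only; nested or indented fences are not handled.

```python
import re

# Build the fence marker programmatically so this snippet can itself
# sit inside a fenced block without confusing a Markdown renderer.
FENCE = "`" * 3

# Opening fence with optional language tag, lazily matched body,
# then a closing fence on a line of its own.
FENCE_RE = re.compile(
    r"^" + FENCE + r"([\w+-]*)[ \t]*\n(.*?)^" + FENCE + r"[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_code_blocks(response_text):
    """Return a list of (language_tag, code) tuples, one per fenced block.

    Text outside the fences (caveats, disclaimers, explanations) is ignored.
    """
    return [(m.group(1), m.group(2)) for m in FENCE_RE.finditer(response_text)]
```

Returning every block, rather than only the first, leaves the choice to the caller, for example filtering on the language tag when a response contains both a shell command and a Python file.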
Journey Context:
Claude models, especially Claude 3 Opus and Sonnet, frequently prepend safety caveats to code output, such as 'Here is the code, but please note that...' or 'I should mention that this approach has security implications...'. GPT-4o does this less often for pure code tasks. These caveats break naive parsers that treat the full response as executable code. The deeper issue is unpredictability: certain keywords in the prompt (file system access, network calls, authentication, subprocess execution) tend to trigger the caveats, and their presence varies between runs. Prompt-level suppression reduces but does not eliminate them. The robust fix is two-fold: (1) suppress caveats in the system prompt for Claude, and (2) always parse code from fenced blocks, since both models reliably use code fences when outputting code.
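The two-fold fix can be sketched as follows, with the caveat that this is an illustration under stated assumptions and not a verified implementation. `SUPPRESSION_PROMPT` holds the instruction text from this report; how it is passed to the model depends on the client library and is deliberately left out. `first_code_block` and `NoCodeBlockError` are hypothetical names. Because caveats vary between runs, the parser fails loudly when no complete fence is found instead of falling back to treating the whole response as code.

```python
# Suppression text quoted from the report. Passing it to a model is
# client-specific and not shown here.
SUPPRESSION_PROMPT = (
    "Output only the requested code with no disclaimers, caveats, or safety "
    "warnings. Do not prepend or append explanatory text outside code fences."
)

# Built programmatically so the snippet can live inside a fenced block.
FENCE = "`" * 3


class NoCodeBlockError(ValueError):
    """Raised when a response has no complete fenced code block."""


def first_code_block(response_text):
    """Return the body of the first complete fenced block, or raise.

    Lines before the opening fence and after the closing fence, where
    caveats usually appear, are never included in the result.
    """
    body = []
    inside = False
    for line in response_text.splitlines():
        if line.strip().startswith(FENCE):
            if inside:
                # Closing fence reached: the block is complete.
                return "\n".join(body)
            inside = True  # Opening fence; the language tag is discarded.
            continue
        if inside:
            body.append(line)
    # Either no fence at all, or an opening fence that was never closed.
    raise NoCodeBlockError("no complete fenced code block in response")
```

Raising on a missing or unterminated fence lets the caller retry or flag the response, which seems safer than executing text that may contain prose.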
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:52:36.808396+00:00— report_created — created