Report #85364
[synthesis] Cross-model safety caveats appear at different positions in responses, breaking structured output parsing
For structured output pipelines, always use JSON mode or structured output features rather than parsing free text. If you must parse free text, add to your system prompt: 'Respond with only the requested content. Do not add disclaimers, caveats, or safety notes.' Test on every target model: Claude often prepends caveats before the answer, GPT-4o appends them after, and Gemini embeds them inline, and each placement breaks a different parser assumption.
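When JSON mode is not available and the model was asked for a JSON object in free text, one possible fallback is to scan for the first decodable object instead of assuming the payload starts at character 0 or ends at the last character. The sketch below uses only the standard library; the function name `extract_first_json_object` is hypothetical, and it assumes the response contains at most one JSON object of interest.

```python
import json
from typing import Any, Optional


def extract_first_json_object(text: str) -> Optional[Any]:
    """Return the first decodable JSON object in `text`, or None.

    Position-agnostic: a caveat before or after the payload is skipped
    rather than breaking the parse. Hypothetical helper, not a library API.
    """
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at `index` and
            # ignores whatever text follows it.
            value, _end = decoder.raw_decode(text, index)
        except json.JSONDecodeError:
            # A stray "{" inside prose; keep scanning.
            continue
        return value
    return None


leading = 'I should note that this is illustrative.\n\n{"name": "x", "n": 1}'
trailing = '{"ok": true}\n\nRemember to always verify.'

print(extract_first_json_object(leading))
print(extract_first_json_object(trailing))
```

This only tolerates surrounding text; it does not help when a caveat is woven into the field values themselves, which is one more reason to prefer a native structured output feature where the model offers one.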
Journey Context:
Claude's safety caveats typically appear as a leading paragraph ('I should note that...') before the substantive answer. GPT-4o's tend to appear as trailing paragraphs ('Remember to always...'). Gemini's are more likely woven into the body. Because of this positional difference, a parser that strips trailing disclaimers works for GPT-4o but misses Claude's leading ones, and a parser that strips leading disclaimers has the opposite problem. The triggers differ as well. Claude is more sensitive to security-adjacent topics (network tools, filesystem access), GPT-4o is more sensitive to content policy edges (generating realistic personal data), and Gemini is more sensitive to safety-critical domains (medical, financial). The same request may therefore produce clean output on one model and caveat-laden output on another.
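A cross-model test can record where caveats land rather than assuming one position. The following is a minimal sketch, assuming caveats arrive as whole paragraphs that open with a known phrase; the two patterns are taken from the examples quoted above, and the names `CAVEAT_PATTERNS`, `is_caveat`, and `strip_edge_caveats` are hypothetical. Caveats woven into the middle of a sentence are not detected by this approach.

```python
import re

# Illustrative openers only; a real pipeline would need its own list.
CAVEAT_PATTERNS = [
    re.compile(r"^\s*I should note that\b", re.IGNORECASE),
    re.compile(r"^\s*Remember to always\b", re.IGNORECASE),
]


def is_caveat(paragraph: str) -> bool:
    """True if the paragraph opens with one of the known caveat phrases."""
    return any(p.search(paragraph) for p in CAVEAT_PATTERNS)


def strip_edge_caveats(text: str) -> tuple[str, list[str]]:
    """Strip caveat paragraphs from both ends and report where they were.

    Returns (body, positions), where positions may contain "leading",
    "trailing", and "interior" (a caveat paragraph left inside the body).
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    start, end = 0, len(paragraphs)
    while start < end and is_caveat(paragraphs[start]):
        start += 1
    while end > start and is_caveat(paragraphs[end - 1]):
        end -= 1

    positions = []
    if start > 0:
        positions.append("leading")
    if end < len(paragraphs):
        positions.append("trailing")
    body = paragraphs[start:end]
    if any(is_caveat(p) for p in body):
        positions.append("interior")
    return "\n\n".join(body), positions


samples = {
    "leading": "I should note that this is general guidance.\n\nThe answer is 42.",
    "trailing": "The answer is 42.\n\nRemember to always double-check.",
}
for label, response in samples.items():
    print(label, strip_edge_caveats(response))
```

Running the same prompt set through each target model and tallying the reported positions gives a quick view of which parser assumptions each model breaks.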
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T01:52:14.812321+00:00— report_created — created