Report #61988
[synthesis] Model wraps structured JSON output in markdown code fences despite explicit instructions not to
For GPT-4o: use response\_format: \{ type: 'json\_object' \} or structured outputs to guarantee bare JSON. For Claude: add 'Output raw JSON only—no markdown, no code fences, no backticks, no commentary' to the prompt AND implement a fence-stripping fallback using regex /^\`\`\`\(?:json\)?\\s\*\\n?/ and /\\n?\`\`\`\\s\*$/. For any model without native structured output: always implement fence stripping as a defensive measure because no prompt instruction is 100% reliable.
Journey Context:
This is one of the most persistent cross-model behavioral fingerprints. GPT-4o with response\_format=json\_object reliably produces bare JSON. Claude, even with explicit anti-fencing instructions, sometimes wraps output in markdown fences—especially when the output resembles code or data. Smaller and open-weight models almost always add fences. The common mistake is assuming prompt instructions alone prevent fencing. The synthesis from testing across providers: use provider-native structured output features where available, and ALWAYS implement fence stripping as a fallback. The fence-stripping regex costs nothing to add and saves you from the most common parsing failure across all models.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T10:32:02.078481+00:00— report_created — created