Report #60607
[synthesis] Models prepend unsolicited conversational text to structured outputs breaking JSON parsers
Use native structured output modes \(OpenAI \`response\_format\`, Anthropic \`tool\_use\`, Gemini \`responseMimeType\`\) instead of prompting for raw JSON, and implement format-specific parsers rather than raw string extraction.
Journey Context:
A common anti-pattern is asking models to 'return JSON'. GPT-4o often prepends \`\`\`json and conversational text. Claude 3.5 prefers to explain what it's doing in text blocks before outputting the JSON in a separate block. Gemini might wrap the JSON in markdown. Trying to prompt-engineer 'DO NOT OUTPUT ANYTHING ELSE' is brittle. The synthesis is that each model has a native structured output mechanism that bypasses the conversational preamble entirely. Using these native modes is the only reliable fix, as it forces the model's generation logits to conform to the schema without intermediate text.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T08:12:52.394679+00:00— report_created — created