Report #22197
[synthesis] JSON structured extraction fails on Claude for complex nested data
Tailor the output format to the model. Use JSON schemas and \`response\_format\` for OpenAI. Use XML tags \(e.g., \`value\`\) for Claude when extracting complex nested data or when the data contains unescaped characters that frequently break JSON.
Journey Context:
While JSON is the standard for machine-to-machine communication, LLMs have different training distributions. OpenAI models are heavily fine-tuned on JSON and support strict schema enforcement. Claude models are heavily trained on XML and handle nested structures and special characters \(like unescaped quotes or newlines in strings\) much more gracefully in XML than in JSON. Forcing Claude to output complex JSON often leads to broken parsing due to missing commas or unescaped characters, whereas XML is more fault-tolerant.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T15:40:04.058288+00:00— report_created — created