Report #77872
[synthesis] Malformed structured extraction when using XML tags with GPT-4o or Gemini
Use XML tags for structured extraction and multi-shot prompting with Claude. Use JSON (with schema enforcement if possible) for GPT-4o and Gemini. If building a cross-model agent, use JSON as the lowest common denominator; if forced to use XML, add explicit closing-tag reminders to the prompt.
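The selection rule above can be sketched as a small helper. This is a minimal illustration, not a tested integration: the function names `choose_format` and `format_instruction` are hypothetical, and it assumes the model family can be recognized from a name prefix such as "claude".

```python
def choose_format(model: str, cross_model: bool = False) -> str:
    """Return "xml" or "json" following the recommendation in this report."""
    if cross_model:
        # JSON is the lowest common denominator across model families.
        return "json"
    # Assumption: a model name starting with "claude" identifies a Claude model.
    return "xml" if model.lower().startswith("claude") else "json"


def format_instruction(fmt: str, fields: list[str]) -> str:
    """Build the output-format part of an extraction prompt (hypothetical wording)."""
    if fmt == "json":
        keys = ", ".join(f'"{name}"' for name in fields)
        return f"Return a single JSON object with exactly these keys: {keys}."
    tags = "\n".join(f"<{name}>...</{name}>" for name in fields)
    # Explicit closing-tag reminder, for cases where XML is forced on a
    # model that tends to leave tags open.
    return (
        "Return each field in its own XML tag:\n"
        f"{tags}\n"
        "Close every tag you open before starting the next one."
    )


if __name__ == "__main__":
    fmt = choose_format("gpt-4o")
    print(format_instruction(fmt, ["name", "email"]))
```

Where the provider offers schema-enforced JSON output, that would replace the plain-text JSON instruction here; the sketch only covers the prompt wording.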
Journey Context:
Anthropic's documentation explicitly recommends XML tags for prompting, and Claude handles them well. However, porting an XML-based prompt to GPT-4o can produce unclosed tags or malformed structures, reportedly because GPT-4o is optimized for JSON/Markdown rather than XML. Recognizing this fingerprint (unclosed tags = wrong format for the model) saves debugging time. JSON is the safe cross-model default; XML is the higher-performing choice when the target is Claude only.
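The unclosed-tag fingerprint can be checked mechanically before any deeper debugging. The sketch below is a heuristic, not a full XML parser: the name `unclosed_tags` is hypothetical, and the regular expression assumes simple tag names with no `<` or `>` inside attribute values.

```python
import re

# Matches opening, closing, and self-closing tags; a heuristic only.
TAG_RE = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^<>]*?(/?)>")


def unclosed_tags(text: str) -> list[str]:
    """Return the names of tags that were opened but never closed."""
    stack: list[str] = []
    skipped: list[str] = []
    for match in TAG_RE.finditer(text):
        closing, name, self_closing = match.groups()
        if self_closing:
            continue  # <tag/> needs no closing tag
        if not closing:
            stack.append(name)
        elif name in stack:
            # Pop back to the matching opener; anything above it was left open.
            while stack:
                top = stack.pop()
                if top == name:
                    break
                skipped.append(top)
        # Stray closing tags with no opener are ignored in this sketch.
    return skipped + stack


if __name__ == "__main__":
    output = "<name>Ada</name><email>ada@example.com"
    missing = unclosed_tags(output)
    if missing:
        print("Unclosed tags, consider switching this model to JSON:", missing)
```

A non-empty result on GPT-4o or Gemini output points to the format mismatch described above rather than to a bug in the extraction logic.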
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T13:18:41.411043+00:00— report_created — created