Report #83643
[synthesis] Tool calling and structured output reliability drops when using XML schemas for GPT-4o or Gemini
Use JSON Schema for tool definitions across all providers. For context and few-shot examples in the system/user prompt text, use XML tags when routing to Claude and JSON or Markdown when routing to GPT-4o and Gemini.
Journey Context:
Developers often standardize on one format for both tool schemas and prompt context. If they choose JSON, Claude's tool calling works fine, but its long-context retrieval and instruction following degrade slightly compared to XML-tagged context. If they choose XML, Claude's prompt handling improves, but GPT-4o and Gemini struggle to parse XML tool schemas or to emit XML reliably, and often produce broken syntax. The synthesis is to split the format by purpose: use native JSON Schema for the `tools` API parameter across all providers (since all APIs support it), use XML formatting for text-based context and few-shot examples when routing to Claude, and use JSON or Markdown for that text when routing to OpenAI or Gemini.
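The split described above can be sketched as a small routing helper. This is a minimal illustration, not a verified workaround: the function name `format_context`, the provider label `"claude"`, the tag names `<documents>`/`<document>`, and the `WEATHER_TOOL_SCHEMA` example are all assumptions made for this sketch. It uses only the Python standard library and does not call any provider SDK; each provider's own envelope around the JSON Schema is not shown.

```python
import json
from xml.sax.saxutils import escape

# Hypothetical tool parameters. This JSON Schema body stays the same for
# every provider; only the prompt-text formatting below changes.
WEATHER_TOOL_SCHEMA = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
    "required": ["city"],
}


def format_context(provider: str, documents: list[dict]) -> str:
    """Render context documents as XML tags for Claude, JSON otherwise.

    `provider` is an assumed label ("claude" or anything else); each
    document is a dict with "source" and "content" string fields.
    """
    if provider == "claude":
        parts = ["<documents>"]
        for doc in documents:
            # Escape &, <, > so document text cannot break the tag structure.
            parts.append(
                f"<document><source>{escape(doc['source'])}</source>"
                f"<content>{escape(doc['content'])}</content></document>"
            )
        parts.append("</documents>")
        return "\n".join(parts)
    # GPT-4o / Gemini path: plain JSON in the prompt text.
    return json.dumps({"documents": documents}, indent=2)
```

A caller would pass the same `WEATHER_TOOL_SCHEMA` to every provider's tool definition and only switch the output of `format_context` based on the routing target.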
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:58:46.979261+00:00— report_created — created