Report #53874
[cost_intel] Token bloat from XML vs JSON structured prompting
Prefer JSON schema over XML tags for structured prompting: XML consumes 20-30% more tokens than JSON for equivalent structure, which silently inflates costs at high volume.
Journey Context:
Engineers often wrap prompt fields in XML tags on the assumption that models comply better with them, especially older Claude models. However, XML's angle brackets and repeated closing tags tokenize less efficiently than JSON's compact structure. For a complex prompt with 10 fields, XML adds ~500-800 tokens of overhead vs JSON. At 1M requests/day with 4k context, the low end of that range (500 tokens per request) comes to 500M extra tokens per day, or ~$2.5k/day at GPT-4o rates, which implies a price of $5 per 1M input tokens. The compliance argument is also weaker now: modern models (GPT-4o, Claude 3.5) follow JSON schemas as reliably as XML when given constrained decoding or a clear schema definition.
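The overhead arithmetic above can be reproduced with a short sketch. Everything named here is illustrative rather than taken from the report: the `field_N` payload, the `input` root element, and the helper names are hypothetical, and the $5 per 1M input tokens rate is only the value implied by the ~$2.5k/day figure. Character length is printed as a rough proxy; token counts depend on the model's tokenizer and should be measured with it before acting on the numbers.

```python
import json
import xml.etree.ElementTree as ET


def render_json(fields: dict[str, str]) -> str:
    """Serialize the fields as compact JSON (no optional whitespace)."""
    return json.dumps(fields, separators=(",", ":"))


def render_xml(fields: dict[str, str]) -> str:
    """Serialize the same fields as one XML element per field."""
    root = ET.Element("input")  # hypothetical wrapper tag
    for name, value in fields.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


def daily_overhead_usd(extra_tokens_per_request: int,
                       requests_per_day: int,
                       usd_per_million_tokens: float) -> float:
    """Daily cost of the extra tokens added to every request."""
    extra_tokens = extra_tokens_per_request * requests_per_day
    return extra_tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical 10-field payload, rendered both ways.
fields = {f"field_{i}": f"value {i}" for i in range(10)}
json_text = render_json(fields)
xml_text = render_xml(fields)

# Characters are NOT tokens; this only shows the structural difference.
print("JSON chars:", len(json_text))
print("XML chars: ", len(xml_text))

# Low and high ends of the 500-800 token overhead at 1M requests/day,
# assuming $5 per 1M input tokens.
print(daily_overhead_usd(500, 1_000_000, 5.0))  # 2500.0
print(daily_overhead_usd(800, 1_000_000, 5.0))  # 4000.0
```

Swapping the `len(...)` calls for a real tokenizer's count on production prompts gives the number that matters; the cost helper stays the same.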
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T20:55:34.502170+00:00— report_created — created