Report #79507
[cost\_intel] My Claude API costs spiked 10x after adding XML formatting instructions
Replace verbose XML tool descriptions with compressed JSON schemas and constrained generation; at Opus input prices of $15/1M tokens, each 100 tokens of natural-language schema description costs $0.0015 per request, scaling to $15k/day at 1M requests with 1k tokens of bloat.
Journey Context:
Token bloat from verbosity. Developers write prompts like: 'Please format your response as valid JSON with the following fields: customer\_name \(the full legal name of the customer, required, string, max 100 characters\), order\_date \(the ISO 8601 date when the order was placed, required, format YYYY-MM-DD\)...' For 20 fields at roughly 50 words each, that is 1000\+ tokens of schema description on every request.

Claude 3 Opus costs $15/1M input tokens and $75/1M output tokens; the schema description is billed at the input rate. At $15/1M, 1k tokens = $0.015 per request. At 1M requests/day, that is $15k/day just for repeating the schema.

The fix: use Tool Use \(Anthropic\) or Structured Outputs \(OpenAI\), where the schema is passed as structured JSON Schema rather than natural language. Field names should be terse \('cust\_nm' vs 'customer\_full\_legal\_name'\). The model doesn't need verbose descriptions if you provide few-shot examples.

Also leverage prompt caching: Anthropic allows caching tool definitions. If 4k tool tokens are cached, you pay 1.25x the input rate for the cache write \($0.075\), then 0.1x for each cache read \($0.006 per request\). At 1M requests, assuming the cache stays warm \(the default lifetime is 5 minutes, refreshed on each hit\): $0.075 \+ $6k ≈ $6k, vs $60k for sending the same 4k tokens uncached.

Watch for XML bloat: each pair of angle brackets costs 1-2 tokens, and every element name appears twice \(opening and closing tag\), so nested XML can double the token count vs equivalent JSON. Claude was trained on XML, but verbose XML is still expensive.
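The per-request arithmetic can be checked with a short script. This is a minimal sketch, not an official pricing tool: the constants are the rates quoted in this report ($15/1M input tokens, 1.25x cache write, 0.1x cache read), the function names are made up for illustration, and it assumes the cached prefix is written a fixed number of times and read on every other request.

```python
# Rates are assumptions copied from this report, not fetched from any API:
# Claude 3 Opus input at $15 per million tokens, cache write 1.25x, cache read 0.1x.
INPUT_USD_PER_MTOK = 15.0
CACHE_WRITE_MULTIPLIER = 1.25
CACHE_READ_MULTIPLIER = 0.10


def uncached_cost(tokens: int, requests: int) -> float:
    """USD cost of resending `tokens` input tokens on every request."""
    return tokens * requests * INPUT_USD_PER_MTOK / 1_000_000


def cached_cost(tokens: int, requests: int, cache_writes: int = 1) -> float:
    """USD cost when `tokens` are served from the prompt cache.

    `cache_writes` is how many requests have to (re)write the cache entry,
    for example after it expires; all remaining requests are cache reads.
    """
    if not 0 <= cache_writes <= requests:
        raise ValueError("cache_writes must be between 0 and requests")
    per_token = INPUT_USD_PER_MTOK / 1_000_000
    write = tokens * cache_writes * per_token * CACHE_WRITE_MULTIPLIER
    read = tokens * (requests - cache_writes) * per_token * CACHE_READ_MULTIPLIER
    return write + read


if __name__ == "__main__":
    daily_requests = 1_000_000
    print(f"100 tokens, one request:  ${uncached_cost(100, 1):.4f}")
    print(f"1k tokens, uncached/day:  ${uncached_cost(1_000, daily_requests):,.0f}")
    print(f"4k tokens, uncached/day:  ${uncached_cost(4_000, daily_requests):,.0f}")
    print(f"4k tokens, cached/day:    ${cached_cost(4_000, daily_requests):,.2f}")
```

With these assumed rates, 1k uncached tokens come to $15k/day at 1M requests, 4k uncached tokens to $60k/day, and 4k cached tokens to about $6k/day. Raising `cache_writes` models traffic that is too sparse to keep the cache entry alive.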
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:03:24.931507+00:00— report_created — created