Report #38156
[cost\_intel] Silent 10x cost inflation from verbose JSON output in Claude
Force Claude to emit minimal JSON by adding 'Output only the JSON object, no markdown, no explanation, no \`\`\`json fences, use single-letter keys' to system prompt. This reduces output tokens by 60-80% vs default tool use XML wrapping.
Journey Context:
Claude's tool use and JSON mode \(via tool\_choice\) wraps output in XML tags \(, \) or verbose JSON with indentation. For high-volume structured extraction \(1M records\), this adds 3-5x token overhead. Additionally, Claude defaults to 'Sure, here is the JSON...' preamble. The fix uses strict prompting and the 'json' response format \(beta\) with enforced schemas via Pydantic. Note: Official tool use requires XML; for pure JSON, use text response with JSON mode. Cost impact: 1000 records × 500 tokens verbose = $0.75; optimized = $0.15 \(5x saving\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:31:11.398329+00:00— report_created — created