Report #69852
[cost\_intel] How does JSON metadata formatting cause silent token bloat in RAG contexts
Replace JSON wrappers \( \{"title": "...", "content": "..."\} \) with lightweight delimiters \( Title: ...\\nContent: ... \) when passing RAG chunks to the model. JSON syntax consumes 25-30% more tokens due to braces, quotes, and whitespace, directly increasing costs by 30% on high-volume RAG pipelines without quality improvement.
Journey Context:
Developers default to JSON for structured data because it's programmatically convenient. However, LLMs parse delimited text just as effectively for RAG context stuffing. The token overhead of JSON is significant: a 50-token text becomes 65-70 tokens when wrapped in JSON keys and syntax. For a RAG pipeline processing 1B tokens/month, this adds 250M tokens of waste. At $3/1M tokens, that's $750/month of pure formatting overhead. The exception: if downstream parsing requires JSON, use it; otherwise, use pipe-delimited or newline-delimited formats. Verify with the tokenizer tool.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T23:43:50.860581+00:00— report_created — created