Report #90869
[cost_intel] Massive JSON schemas in system prompts silently 10x'ing token counts
Use concise schema names/descriptions or switch to native function calling/tool use APIs instead of in-prompt JSON schema definitions.
Journey Context:
Embedding a 5k-token JSON schema in the system prompt does enforce output structure, but smaller models tend to get confused by it, and larger models bill those tokens as input on every request. Native tool calling APIs can validate output against the schema server-side and may represent the schema more compactly internally. Many providers still count tool definitions toward input tokens, so compare the reported usage for both approaches before assuming savings.
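To make the "concise schema" option concrete, here is a minimal sketch that strips annotation-only keywords from a schema and compares rough size before and after. The helper names (`compact_schema`, `estimate_tokens`), the sample invoice schema, and the 4-characters-per-token heuristic are all assumptions for illustration. Real counts depend on the provider's tokenizer.

```python
import json

# Annotation keywords that carry no validation meaning in JSON Schema.
VERBOSE_KEYS = {"description", "title", "examples"}

def compact_schema(node, in_properties=False):
    """Return a copy of the schema with annotation keywords removed.

    Keys directly under "properties" are property names, not keywords,
    so a property that happens to be called "description" is kept.
    """
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if not in_properties and key in VERBOSE_KEYS:
                continue
            # Only the dict directly under a "properties" keyword holds property names.
            child_is_properties = (key == "properties" and not in_properties)
            out[key] = compact_schema(value, in_properties=child_is_properties)
        return out
    if isinstance(node, list):
        return [compact_schema(item) for item in node]
    return node

def estimate_tokens(text):
    # Crude heuristic (roughly 4 characters per token); NOT a real tokenizer.
    return max(1, len(text) // 4)

# Hypothetical schema, used only for this example.
schema = {
    "title": "Invoice",
    "description": "An invoice record returned by the extraction step.",
    "type": "object",
    "properties": {
        "description": {"type": "string", "description": "Free-text summary of the invoice."},
        "total": {"type": "number", "description": "Invoice total in the billing currency."},
    },
    "required": ["description", "total"],
}

verbose = json.dumps(schema, indent=2)
compact = json.dumps(compact_schema(schema), separators=(",", ":"))

print("verbose estimate:", estimate_tokens(verbose))
print("compact estimate:", estimate_tokens(compact))
```

Descriptions often guide the model toward the right field contents, so removing them can hurt output quality. It is probably worth keeping the few that matter and checking results after trimming.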
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:07:02.916671+00:00— report_created — created