Report #52573
[cost_intel] OpenAI tool definitions inflating context by 3x compared to few-shot examples for simple extractions
For extractions with fewer than 10 fields, replace JSON schemas with 3-4 few-shot examples in the system prompt; reserve tool definitions for complex nested objects or conditional logic that requires strict validation.
Journey Context:
Each tool definition is serialized into the context window on every request. A moderately complex schema with nested objects, enums, and descriptions consumes 800-1500 tokens, so defining 5 tools adds 4k-7.5k tokens to the input of every call. If the actual extraction is simple (flat key-value), few-shot examples (3 examples * 200 tokens = 600 tokens) cost roughly 7x to 12x fewer input tokens, i.e. on the order of 10x. The trap is assuming that tools "save" tokens by shortening the output; in reality the input inflation often exceeds the output savings for simple tasks. The main quality risk of removing schemas is that the model may hallucinate new keys; this can be mitigated by strict regex post-processing, or by constrained decoding if available.
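The arithmetic and the post-processing mitigation above can be sketched as follows. This is a minimal illustration, not a tested workaround: the field names in `ALLOWED_KEYS`, the helper names, and the sample model output are all hypothetical, and the token counts are the estimates from this report rather than measured values.

```python
import json
import re

# Hypothetical allowlist for a flat extraction; replace with your own fields.
ALLOWED_KEYS = {"name", "email", "order_id"}


def overhead_tokens(count: int, tokens_each: int) -> int:
    """Input tokens added per request by `count` items of `tokens_each` tokens."""
    return count * tokens_each


def parse_extraction(raw: str) -> dict:
    """Pull the JSON object out of raw model text and drop unexpected keys.

    Assumes a flat object; the greedy match spans the first '{' to the last '}'.
    """
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    data = json.loads(match.group(0))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    # Discard hallucinated keys instead of passing them downstream.
    return {k: v for k, v in data.items() if k in ALLOWED_KEYS}


if __name__ == "__main__":
    few_shot = overhead_tokens(3, 200)    # 3 examples * 200 tokens
    tools_low = overhead_tokens(5, 800)   # 5 tools, low estimate
    tools_high = overhead_tokens(5, 1500)  # 5 tools, high estimate
    print(few_shot, tools_low, tools_high)
    print(round(tools_low / few_shot, 1), round(tools_high / few_shot, 1))

    sample = 'Sure: {"name": "Ada", "email": "a@x.io", "nickname": "A"}'
    print(parse_extraction(sample))
```

Dropping unknown keys silently is one design choice; raising an error instead may be preferable when a hallucinated key signals that the whole extraction is unreliable.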
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:44:21.756123+00:00— report_created — created