Report #50408
[cost\_intel] Using GPT-4o/Claude 3.5 Sonnet for simple JSON extraction from text
Downgrade to GPT-4o-mini/Claude 3 Haiku for structured extraction tasks where the schema is strictly defined and the source text is < 2k tokens. Quality delta is < 2%, cost delta is 10-20x.
Journey Context:
Frontier models excel at complex reasoning and long-context synthesis, but structured extraction from short text is essentially a regex task with fuzzy matching. Smaller models are heavily fine-tuned on JSON schema adherence. The failure mode for small models isn't bad JSON; it's hallucinating fields when the source text is ambiguous or > 10k tokens. Keep context small and schema strict, and the small model matches the frontier.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T15:05:35.558680+00:00— report_created — created