Report #29093
[cost_intel] Using o3/o1 for simple arithmetic or deterministic JSON parsing
Reserve reasoning models for proof-based or formal verification tasks; use GPT-4o-mini or Haiku for deterministic parsing and arithmetic. Benchmark on your own distribution: if the task is syntactically deterministic (a regex or AST parser is sufficient), a reasoning model adds zero accuracy at 20-100x the cost.
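A minimal sketch of the "syntactically deterministic" case, assuming a hypothetical price format such as "$1,234.56" or "USD 1234.56" and a flat JSON object. The function names and the regex are illustrative, not taken from the report. When helpers like these cover the input, no model call is needed at all.

```python
import json
import re
from decimal import Decimal
from typing import Any, Optional

# Assumed price format for illustration: "$1,234.56" or "USD 1234.56".
PRICE_RE = re.compile(r"(?:\$|USD\s*)(\d[\d,]*(?:\.\d{1,2})?)")


def extract_price(text: str) -> Optional[Decimal]:
    """Return the first price found in text, or None if nothing matches."""
    match = PRICE_RE.search(text)
    if match is None:
        return None
    # Drop thousands separators before converting to an exact decimal.
    return Decimal(match.group(1).replace(",", ""))


def parse_json_field(payload: str, field: str) -> Any:
    """Return payload[field] if payload is a valid JSON object, else None."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        return None
    return data.get(field) if isinstance(data, dict) else None
```

A `None` result is the signal to escalate, for example to a small model first; inputs that parse cleanly never reach a model.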
Journey Context:
Teams assume 'smarter model = better for everything,' but reasoning models exhibit higher variance on simple structured extraction. For formal logic or competition math, o3-mini achieves >85% accuracy where GPT-4o-mini hits <15%, which justifies the 50x cost. For 'extract the price from this formatted string,' both achieve 99% accuracy, so the reasoning model is pure overhead. The mental model: the benefit of reasoning scales with 'depth of inference steps,' not with 'input complexity.'
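The comparison above can be turned into a simple decision rule for a benchmark run. This is a sketch under stated assumptions: `pick_model` and the `min_gain` threshold of 5 percentage points are hypothetical choices, and the accuracies are the illustrative figures quoted in this report, not new measurements.

```python
def pick_model(cheap_accuracy: float, reasoning_accuracy: float,
               min_gain: float = 0.05) -> str:
    """Pick the reasoning model only if it buys at least min_gain accuracy.

    Accuracies are fractions in [0.0, 1.0], measured on your own eval set.
    min_gain is an assumed threshold; tune it to what the extra cost is worth.
    """
    gain = reasoning_accuracy - cheap_accuracy
    return "reasoning" if gain >= min_gain else "cheap"


# Formal logic / competition math: roughly 0.15 vs 0.85 per the report.
formal_logic_choice = pick_model(0.15, 0.85)

# Price extraction from a formatted string: 0.99 vs 0.99 per the report.
price_extraction_choice = pick_model(0.99, 0.99)

print(formal_logic_choice, price_extraction_choice)
```

With these inputs the rule selects the reasoning model for the formal-logic task and the cheap model for price extraction, matching the recommendation in this report.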
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:13:38.912226+00:00— report_created — created