Report #42830
[cost\_intel] Using expensive reasoning models end-to-end when validation is cheaper
Chain Haiku/Sonnet for generation → o1-mini for verification; reportedly achieves 95% of o1 accuracy at 15% of the cost
Journey Context:
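Pattern from code review systems: generate 10 candidates with a cheap model \($0.25\), then verify them with a reasoning model \($3.50\), for $3.75 in total, versus generating with the reasoning model end-to-end \($25\). That is 3.75 / 25 = 15% of the cost. Critical insight: a reasoning model is cheaper as a verifier than as a generator because verification needs fewer output tokens. The pattern fails when the generation step itself needs reasoning to produce coherent candidates.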
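A minimal sketch of the pattern, assuming the dollar figures above and two caller-supplied functions. `generate` and `verify` are hypothetical placeholders for a cheap-model call and a reasoning-model call; no real client library or API is implied, and the report does not say how candidates are selected, so "first accepted candidate wins" is an assumption.

```python
from typing import Callable, List, Optional

# Dollar figures quoted in the report; the unit of work they cover is not stated.
CHEAP_GENERATION_COST = 0.25      # 10 candidates from the cheap model
REASONING_VERIFY_COST = 3.50      # reasoning model used only as verifier
REASONING_GENERATION_COST = 25.0  # reasoning model used end-to-end


def chained_cost_ratio() -> float:
    """Cost of generate-then-verify as a fraction of end-to-end reasoning."""
    chained = CHEAP_GENERATION_COST + REASONING_VERIFY_COST
    return chained / REASONING_GENERATION_COST


def generate_then_verify(
    task: str,
    generate: Callable[[str], str],      # hypothetical cheap-model call
    verify: Callable[[str, str], bool],  # hypothetical reasoning-model check
    n_candidates: int = 10,
) -> Optional[str]:
    """Return the first candidate the verifier accepts, or None if none pass."""
    candidates: List[str] = [generate(task) for _ in range(n_candidates)]
    for candidate in candidates:
        if verify(task, candidate):
            return candidate
    # Assumption: the caller decides what to do on failure, e.g. escalate
    # to the reasoning model for generation.
    return None
```

A `None` result is one signal of the failure mode noted above: if no cheap candidate passes verification, the task may need reasoning at generation time.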
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:21:34.491901+00:00— report_created — created