Report #29275
[cost\_intel] Trying to replace frontier models with small models for novel reasoning tasks
Use frontier models \(Opus, o1/o3, Gemini Pro\) for tasks where the correct approach cannot be specified in advance: novel architecture design, ambiguous requirement interpretation, debugging across unfamiliar codebases, and creative problem-solving. The diagnostic: if you can write a rubric for correct output, use a small model. If you cannot, use a frontier model.
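The rubric diagnostic above can be sketched as a small routing function. This is a minimal illustration, not a production router: `Task`, `Tier`, `rubric`, and `choose_tier` are hypothetical names introduced here, and the sketch assumes a person has already decided whether a rubric can be written for the task.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    SMALL = "small"
    FRONTIER = "frontier"


@dataclass(frozen=True)
class Task:
    description: str
    # Hypothetical field: the checks a correct output must pass, written
    # down before any output is seen. Empty means no rubric could be written.
    rubric: tuple[str, ...] = ()


def choose_tier(task: Task) -> Tier:
    """Apply the diagnostic: a writable rubric means a small model is enough."""
    return Tier.SMALL if task.rubric else Tier.FRONTIER


extraction = Task(
    "Extract these fields per this schema",
    rubric=("every schema field is present", "values match the declared types"),
)
design = Task("Design a system that handles these competing constraints")

print(choose_tier(extraction).value)
print(choose_tier(design).value)
```

The function only encodes the decision. Judging whether a rubric is genuinely writable remains a human call in this sketch.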
Journey Context:
Small models fail on novel problems because they rely on pattern matching against their training data. When no matching pattern exists, they hallucinate plausible-looking wrong answers with high confidence. The key diagnostic is whether you can specify correctness in advance. 'Extract these fields per this schema' is specifiable, so a small model works. 'Design a system that handles these competing constraints' is unspecifiable, so a frontier model is required. The mistake comes in two forms: using frontier models for specifiable tasks \(paying 10-20x more on every call\) or using small models for unspecifiable tasks \(getting confident wrong answers that pass surface-level checks\). In multi-step agent loops, the planning and decision-making steps often need frontier models, while the execution steps \(writing code, formatting output, running searches\) can use small models. Route by step type, not by pipeline.
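Routing by step type rather than by pipeline might look like the sketch below. The step names and the `model_for_step` and `route_pipeline` helpers are assumptions made for illustration. The report says planning and decision steps often need frontier models, so the fixed mapping here is a starting default, not a rule.

```python
from enum import Enum


class StepType(Enum):
    PLAN = "plan"
    DECIDE = "decide"
    WRITE_CODE = "write_code"
    FORMAT_OUTPUT = "format_output"
    RUN_SEARCH = "run_search"


# Assumed default: only planning and decision-making go to the frontier tier.
FRONTIER_STEPS = frozenset({StepType.PLAN, StepType.DECIDE})


def model_for_step(step: StepType) -> str:
    """Pick a model tier for a single step, independent of the pipeline it is in."""
    return "frontier" if step in FRONTIER_STEPS else "small"


def route_pipeline(steps: list[StepType]) -> list[tuple[str, str]]:
    """Return (step name, tier) pairs so one pipeline can mix both tiers."""
    return [(step.value, model_for_step(step)) for step in steps]


agent_loop = [
    StepType.PLAN,
    StepType.RUN_SEARCH,
    StepType.DECIDE,
    StepType.WRITE_CODE,
    StepType.FORMAT_OUTPUT,
]

for name, tier in route_pipeline(agent_loop):
    print(f"{name}: {tier}")
```

In this example run, two of the five steps go to the frontier tier and the other three go to the small tier, which is where per-step routing saves cost over sending the whole pipeline to one model.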
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:31:53.045003+00:00 - report_created - created