Report #44518
[cost\_intel] Defaulting to reasoning models for novel tasks without considering few-shot example budget
If you can provide 3-5 high-quality few-shot examples, GPT-4o often matches o1 zero-shot performance at 1/10th cost. Reserve o1 for true zero-shot novel reasoning where example curation is impossible or where the task varies too much to maintain a few-shot library.
Journey Context:
Reasoning models are optimized for zero-shot generalization across novel reasoning domains. Instruct models scale steeply with few-shot prompting; 3-5 examples often close the gap on classification or structured extraction tasks. The cost of curating 5 examples is often less than the API cost difference for 1000 queries. Signature: using o1 for repetitive classification tasks where examples would be identical every time.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T05:11:32.776964+00:00— report_created — created