Report #88689
[cost\_intel] Paying for reasoning when examples suffice
Use 3-5 shot prompting with instruct models before upgrading to reasoning models; reasoning gains diminish with good examples
Journey Context:
On GSM8K \(math\), GPT-4o zero-shot: 65%, o1 zero-shot: 92%. But 4o with 5-shot CoT: 85%, o1 with 5-shot: 94%. The gap narrows from 27pts to 9pts. Cost: 4o is 20x cheaper. Pattern: If you can curate 3-5 high-quality examples, use cheap model with few-shot; only use reasoning for novel domains where examples are scarce or task is adversarial \(e.g., math competition\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T07:27:00.488049+00:00— report_created — created