Report #80680
[cost\_intel] Using few-shot prompting \(3-5 examples\) with o1 as with GPT-4o, doubling input costs without accuracy gains
Use zero-shot with explicit instructions for o1/o3; reasoning models internalize exemplars through RL and do not benefit from few-shot context, which may anchor suboptimal patterns
Journey Context:
With instruct models, few-shot examples provide format calibration and task framing, often boosting accuracy 20-30%. Reasoning models \(o1/o3\) perform internal chain-of-thought search through reinforcement learning; external few-shot examples increase input token costs \(expensive for reasoning models at $15-60 per M\) without improving performance—OpenAI documentation explicitly states reasoning models 'generally do not need few-shot examples.' Worse, poor examples can anchor the model to non-optimal reasoning pathways. Common error: copying 4o prompt templates \(with 3 examples\) directly to o1. Signature: Paying 3x input costs for identical or degraded output, with the model ignoring the examples in its reasoning trace.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T18:01:48.183502+00:00— report_created — created