Report #36168
[cost_intel] Using chain-of-thought prompting on GPT-4o instead of native reasoning for complex logic
For tasks requiring 5 or more reasoning steps, use o3-mini with native reasoning; for 2-4 steps, use GPT-4o with chain-of-thought prompting
Journey Context:
Manual chain-of-thought (CoT) prompting with GPT-4o has an effective cost of $0.06/1K tokens (due to a 5x token expansion on long chains) and reaches 60% accuracy on logic puzzles requiring 5+ steps. o3-mini's native reasoning costs $0.10/1K tokens but reaches 85% accuracy on the same puzzles. The crossover point is 4 reasoning steps: up to that point, GPT-4o+CoT is 40% cheaper with similar accuracy; beyond it, error accumulation in CoT makes o3-mini both cheaper (per correct answer) and more accurate.
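The routing rule and the per-correct-answer arithmetic above can be sketched as follows. This is a minimal illustration, not a vetted implementation: the rates and accuracies are the figures quoted in this report, while the names `Option`, `choose_option`, `cost_per_correct`, and the `tokens_per_attempt` parameter are hypothetical. Sending 1-step tasks to GPT-4o+CoT is also an assumption, since the report only covers 2 steps and up.

```python
from dataclasses import dataclass

# Figures quoted in this report; treat them as the report's estimates,
# not as official pricing or benchmark results.
CROSSOVER_STEPS = 4  # at or below: GPT-4o + CoT; above: o3-mini


@dataclass(frozen=True)
class Option:
    name: str
    usd_per_1k_tokens: float  # effective rate quoted in the report
    accuracy_5plus: float     # accuracy on logic puzzles needing 5+ steps


GPT4O_COT = Option("gpt-4o + CoT", 0.06, 0.60)
O3_MINI = Option("o3-mini", 0.10, 0.85)


def choose_option(reasoning_steps: int) -> Option:
    """Route by estimated reasoning depth, following the report's crossover."""
    if reasoning_steps < 1:
        raise ValueError("reasoning_steps must be >= 1")
    # Assumption: 1-step tasks follow the same branch as 2-4 step tasks.
    return GPT4O_COT if reasoning_steps <= CROSSOVER_STEPS else O3_MINI


def cost_per_correct(option: Option, tokens_per_attempt: int) -> float:
    """Expected USD per correct answer on 5+ step tasks.

    tokens_per_attempt is a hypothetical input; the report does not say
    how many tokens each model is billed for on a given task.
    """
    cost_per_attempt = option.usd_per_1k_tokens * tokens_per_attempt / 1000
    return cost_per_attempt / option.accuracy_5plus


if __name__ == "__main__":
    for steps in (3, 4, 5, 8):
        print(steps, "->", choose_option(steps).name)
    for opt in (GPT4O_COT, O3_MINI):
        print(opt.name, round(cost_per_correct(opt, 1000), 3))
```

If both options are charged for the same 1,000 tokens per attempt, the quoted rates work out to roughly $0.100 per correct answer for GPT-4o+CoT and $0.118 for o3-mini. The per-correct-answer comparison is therefore sensitive to the actual token counts per task, so it is worth plugging in measured values before relying on it.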
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T15:11:16.165238+00:00— report_created — created