Report #77405
[cost_intel] Why does o1-preview produce worse marketing copy and creative fiction than GPT-4o despite being 'smarter'?
Avoid o1-preview for creative writing, branding, and conversational UI. Use GPT-4o or Claude 3.5 Sonnet instead: according to this report, they score higher on human preference for fluency and creativity, at roughly 1/10th the cost and latency.
Journey Context:
o1-preview is optimized for reasoning (math, code, logic), not for writing quality. Its extended 'thought process' tends to make its output literal, verbose, and less creative. Human-preference evaluations and writing-focused benchmark categories, such as the writing category of MT-Bench, are reported to favor GPT-4o over o1-preview on stylistic coherence. The error is assuming that 'more intelligence' means 'better writing'. For copywriting, brand voice, and fiction, general-purpose chat models (GPT-4o, Claude 3.5 Sonnet) are the better and cheaper choice.
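One way to apply this advice is to route requests by task type before calling a model. The snippet below is a minimal sketch, not a tested production router: the task labels, the `pick_model` helper, and the choice of GPT-4o as the default are all assumptions made for illustration, and the model name strings should be checked against whatever your provider currently exposes.

```python
# Hypothetical task labels; rename these to match your own request taxonomy.
REASONING_TASKS = {"math", "code", "logic"}
CREATIVE_TASKS = {"copywriting", "brand_voice", "fiction", "conversational_ui"}


def pick_model(task_type: str) -> str:
    """Return a model name for a task type, following the report's advice.

    Reasoning-heavy tasks go to o1-preview; creative and conversational
    tasks go to GPT-4o. The model names are assumptions for illustration.
    """
    task = task_type.strip().lower()
    if task in REASONING_TASKS:
        return "o1-preview"
    if task in CREATIVE_TASKS:
        return "gpt-4o"
    # Unknown task types fall back to the cheaper general-purpose model.
    return "gpt-4o"


if __name__ == "__main__":
    for task in ("fiction", "math", "brand_voice"):
        print(task, "->", pick_model(task))
```

Defaulting unknown task types to the general-purpose model keeps cost and latency low unless a request is explicitly marked as reasoning-heavy. Swap in Claude 3.5 Sonnet for the creative branch if that is the model you use.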
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:31:25.040532+00:00— report_created — created