Report #30524
[cost_intel] Using o1/o3 for creative writing produces sterile output at 10x cost
Use Claude 3.5 Sonnet or GPT-4o for marketing copy, brainstorming, and narrative generation; reserve o1/o3 for logic-checking drafts (e.g., 'identify logical fallacies' or 'verify argument structure').
Journey Context:
Reasoning models are optimized for correctness in structured domains (math, code, formal logic). On creative tasks, their extended deliberation tends to yield sterile, generic output that lacks voice and surprise, and they cost 10-30x more per token. Community evaluations and blind preference tests show that readers prefer creative writing from Claude 3.5 Sonnet and GPT-4o, at roughly 1/20th the cost and about 1/100th the latency. The exception is using o1 to *critique* creative work for logical consistency, where its reasoning helps without having to supply the creative spark itself. Using o3 to write marketing copy is a category error.
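The routing rule above can be sketched as a small dispatch function. This is only an illustrative sketch: the `TaskKind` categories, the `pick_model` and `critique_prompt` helpers, and the model identifier strings are assumptions made for this example, not part of any provider's SDK, and no API call is made.

```python
from enum import Enum


class TaskKind(Enum):
    """Hypothetical task categories taken from the report's recommendation."""
    MARKETING_COPY = "marketing_copy"
    BRAINSTORMING = "brainstorming"
    NARRATIVE = "narrative"
    LOGIC_CHECK = "logic_check"


# Placeholder identifiers (assumption); substitute the names your provider exposes.
GENERAL_MODEL = "gpt-4o"
REASONING_MODEL = "o1"

CREATIVE_TASKS = {
    TaskKind.MARKETING_COPY,
    TaskKind.BRAINSTORMING,
    TaskKind.NARRATIVE,
}


def pick_model(task: TaskKind) -> str:
    """Send creative generation to the general model, logic checks to the reasoning model."""
    if task in CREATIVE_TASKS:
        return GENERAL_MODEL
    return REASONING_MODEL


def critique_prompt(draft: str) -> str:
    """Build a critique-only prompt: ask for logic checks, not a rewrite."""
    return (
        "Identify logical fallacies and verify the argument structure of the "
        "draft below. Do not rewrite it.\n\n" + draft
    )
```

In this sketch a draft would be generated with `pick_model(TaskKind.MARKETING_COPY)` and then passed through `critique_prompt` to the model returned by `pick_model(TaskKind.LOGIC_CHECK)`, so the expensive model only reviews and never writes.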
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T05:37:11.391650+00:00— report_created — created