Report #50599
[cost_intel] Using o1 for creative brand storytelling results in over-engineered, sterile output that lacks emotional resonance despite a 'correct' narrative structure
Avoid reasoning models for creative writing, humor, or brand voice tasks: the deliberation process over-optimizes for structural coherence at the expense of surprise and emotional texture. Use Claude 3.5 Sonnet or GPT-4o with a high temperature (0.8-1.0) instead. On HELM creative writing benchmarks, o1 reportedly shows 30% lower human preference scores than GPT-4o. The reasoning step tends to produce generic plot points and eliminate serendipity.
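A minimal sketch of the routing recommended above. The helper `pick_sampling_config`, the task labels, and the fallback to `o1` for everything else are hypothetical and exist only for illustration; the temperature of 0.9 is simply a value picked from inside the 0.8-1.0 range, and the model strings are placeholders to swap for whatever identifiers your provider uses.

```python
from dataclasses import dataclass
from typing import Optional

# Task labels treated as creative in this sketch (assumed names, not a standard).
CREATIVE_TASKS = {"creative_writing", "humor", "brand_voice"}


@dataclass(frozen=True)
class SamplingConfig:
    model: str
    temperature: Optional[float]  # None means "leave the provider default"


def pick_sampling_config(task: str) -> SamplingConfig:
    """Route creative tasks to a non-reasoning model with a high temperature."""
    if task in CREATIVE_TASKS:
        # 0.9 is an assumed midpoint of the 0.8-1.0 range.
        return SamplingConfig(model="gpt-4o", temperature=0.9)
    # Assumption: every other task keeps the reasoning model and its defaults.
    return SamplingConfig(model="o1", temperature=None)
```

The returned `model` and `temperature` can be passed to whichever client library is in use; the sketch deliberately makes no API call, so it can be checked before anything is run against a paid endpoint.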
Journey Context:
Creativity often requires 'System 1' intuitive leaps and permissible logical inconsistencies. Reasoning models try to 'solve' writing like a math problem, producing paint-by-numbers plots that hit every structural beat but evoke no emotion. Common error: assuming that more intelligence means better creativity. The tradeoff is negative: paying 10x buys worse output. The correct pattern is high-temperature sampling from a capable non-reasoning model, which preserves emergent creativity.
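The negative tradeoff can be made concrete with a rough calculation. This sketch takes the two figures quoted in this report (10x cost, 30% lower human preference) and assumes, purely for illustration, that a preference score can be treated as a linear multiplier on the value of each output; the function name `cost_per_preferred_output` is hypothetical.

```python
def cost_per_preferred_output(relative_cost: float, relative_preference: float) -> float:
    """Relative spend per unit of reader preference (assumed linear model)."""
    if relative_preference <= 0:
        raise ValueError("relative_preference must be positive")
    return relative_cost / relative_preference


# Baseline: the non-reasoning model, normalized to cost 1.0 and preference 1.0.
baseline = cost_per_preferred_output(1.0, 1.0)

# Reasoning model: 10x the cost and 30% lower preference, as quoted in this report.
reasoning = cost_per_preferred_output(10.0, 0.7)

print(f"{reasoning / baseline:.1f}x")  # prints "14.3x"
```

Under that assumption the effective penalty is larger than the headline 10x, because the higher price is paid for output that readers prefer less often.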
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T15:24:48.002507+00:00— report_created — created