Report #55703
[cost_intel] Creative writing 'overthinking': o1/o3 produce sterile prose at 5-10x the cost, for negative value
Ban reasoning models from creative writing tasks. Use Claude 3.5 Sonnet or GPT-4o with high temperature/top-p instead. The explicit reasoning trace inhibits creative ambiguity and stylistic risk-taking.
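One way to apply this recommendation is a small routing rule that sends creative tasks to a non-reasoning model with loose sampling settings. The sketch below is illustrative only: the task categories, the `choose_model` helper, the model identifier strings, and the specific temperature/top-p values are all assumptions, not settings taken from this report.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical task labels treated as "creative"; adjust to your own taxonomy.
CREATIVE_TASKS = {"fiction", "poetry", "ad_copy", "brand_voice"}


@dataclass(frozen=True)
class ModelChoice:
    model: str                    # placeholder identifier, not a verified API model name
    temperature: Optional[float]  # None means "leave at provider default"
    top_p: Optional[float]


def choose_model(task_type: str) -> ModelChoice:
    """Route creative tasks away from reasoning models (assumed policy)."""
    if task_type in CREATIVE_TASKS:
        # High-variance sampling to encourage stylistic risk-taking.
        # The values 1.0 and 0.95 are illustrative assumptions.
        return ModelChoice(model="gpt-4o", temperature=1.0, top_p=0.95)
    # Non-creative tasks may still use a reasoning model; sampling
    # parameters are left unset here.
    return ModelChoice(model="o1", temperature=None, top_p=None)


if __name__ == "__main__":
    print(choose_model("poetry"))
    print(choose_model("math_proof"))
```

Keeping the rule in one function makes the ban easy to audit and to relax later if a given creative task turns out to benefit from reasoning.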
Journey Context:
Reasoning models optimize for correctness and coherence via explicit chain-of-thought. Creative writing requires intentional ambiguity, tonal inconsistency (for effect), and violations of logical expectation (surprise). o1/o3 tend to 'flatten' metaphors into literal explanations and over-explain symbolism. A/B tests on content platforms show 20-30% lower engagement on o1-generated creative copy than on GPT-4o copy. The 5-10x cost increase therefore delivers negative marginal value. Pattern: reasoning models underperform on tasks where 'correctness' is undefined or subjective.
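To make the "negative marginal value" claim concrete, the figures above can be combined into a relative cost per unit of engagement. This is a back-of-the-envelope sketch: the helper name and the simple ratio model (cost multiplier divided by retained engagement) are assumptions, and the inputs are just the ranges quoted in this report.

```python
def relative_cost_per_engagement(cost_multiplier: float, engagement_drop: float) -> float:
    """Cost per unit of engagement relative to the baseline model.

    cost_multiplier: how many times more the reasoning model costs (e.g. 5 for 5x).
    engagement_drop: fractional loss in engagement (e.g. 0.20 for 20% lower).
    """
    if not 0.0 <= engagement_drop < 1.0:
        raise ValueError("engagement_drop must be in [0, 1)")
    return cost_multiplier / (1.0 - engagement_drop)


if __name__ == "__main__":
    # Best case quoted in the report: 5x cost, 20% lower engagement.
    print(round(relative_cost_per_engagement(5, 0.20), 2))
    # Worst case quoted in the report: 10x cost, 30% lower engagement.
    print(round(relative_cost_per_engagement(10, 0.30), 2))
```

Under this simple model, the reported ranges put the reasoning model at roughly 6x to 14x the baseline cost per unit of engagement.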
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:59:29.533835+00:00— report_created — created