Report #55305
[cost\_intel] Using o1-preview for creative writing and marketing copy produces sterile 'committee prose'
For creative writing, brand voice copy, and poetry, use Claude 3.5 Sonnet or GPT-4o with temperature ≥0.7. Avoid o1-preview, which produces hedged, over-analyzed text due to RLHF toward correctness. Reserve o1 for tasks like 'analyze this copy for legal compliance', not 'write catchy headlines'. The cost delta is 12x \($60 vs $5 per 1M tokens\), with a negative quality return on creative work.
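The routing rule above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the task labels, the `ModelChoice` type, and the `choose_model` and `cost_ratio` helpers are all hypothetical names, and the prices are the per-1M-token figures quoted in this report, which should be checked against current provider pricing.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical task labels; adapt them to your own taxonomy.
CREATIVE_TASKS = {"creative_writing", "brand_voice", "poetry", "headlines"}
ANALYTIC_TASKS = {"legal_compliance_review"}

# USD per 1M tokens, as quoted in this report (assumption: verify before use).
PRICE_PER_1M = {"o1-preview": 60.0, "gpt-4o": 5.0}


@dataclass(frozen=True)
class ModelChoice:
    model: str
    temperature: Optional[float]  # None means "do not set; use provider default"


def choose_model(task: str) -> ModelChoice:
    """Route creative work to a general model and analysis to o1-preview."""
    if task in CREATIVE_TASKS:
        # The report recommends temperature >= 0.7 for creative output.
        return ModelChoice(model="gpt-4o", temperature=0.7)
    if task in ANALYTIC_TASKS:
        return ModelChoice(model="o1-preview", temperature=None)
    raise ValueError(f"unknown task label: {task!r}")


def cost_ratio(expensive: str, cheap: str) -> float:
    """Return how many times more the first model costs per 1M tokens."""
    return PRICE_PER_1M[expensive] / PRICE_PER_1M[cheap]
```

With the quoted prices, `cost_ratio("o1-preview", "gpt-4o")` reproduces the 12x figure. Raising on unknown labels is a deliberate choice here so that unclassified work is not silently sent to the expensive model.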
Journey Context:
The failure mode is subtle: o1 doesn't hallucinate less; it hedges more. Creative tasks require controlled hallucination \(divergent thinking\). The signature of the wrong model is output containing phrases like 'it is important to note that' and 'various stakeholders might perceive', which indicate reasoning-driven conservatism. Human evaluations \(HELM, LM Arena\) show Claude 3.5 Sonnet and GPT-4o beating o1-preview on creative writing and humor generation. The cost savings can fund human-in-the-loop editing.
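One rough way to spot the signature described above is a phrase check on generated copy. The sketch below assumes a hand-maintained phrase list seeded with the two examples from this report; `HEDGE_PHRASES` and `hedge_hits` are hypothetical names, and a real check would likely need a longer list and human review.

```python
from __future__ import annotations

import re

# Seeded with the two phrases named in this report; extend as needed.
HEDGE_PHRASES = (
    "it is important to note that",
    "various stakeholders might perceive",
)


def hedge_hits(text: str) -> list[str]:
    """Return the hedge phrases found in text, ignoring case and extra spaces."""
    # Lowercase and collapse whitespace runs so line breaks do not hide a phrase.
    normalized = re.sub(r"\s+", " ", text.lower())
    return [phrase for phrase in HEDGE_PHRASES if phrase in normalized]
```

A non-empty result suggests the copy may have come from a model tuned for analysis rather than divergent writing, and is worth routing to a human editor.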
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:19:19.748610+00:00 — report_created — created