Report #95509
[cost_intel] Using GPT-4o-mini for complex architectural refactoring across multiple files
Reserve GPT-4o or Claude 3.5 Sonnet for tasks that require coordinating more than 3 files, applying novel design patterns, or debugging race conditions. Cheaper models drop to 60% accuracy on multi-hop reasoning that spans the context window.
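The rule above can be sketched as a small routing function. This is only an illustration: the `Task` fields, the `pick_model` name, and the model label strings are hypothetical and not part of any provider's API.

```python
from dataclasses import dataclass

# Hypothetical labels; substitute whatever model identifiers you actually use.
STRONG_MODEL = "gpt-4o"
CHEAP_MODEL = "gpt-4o-mini"


@dataclass
class Task:
    """Illustrative description of a coding task (all fields are assumptions)."""
    files_to_coordinate: int
    novel_design_pattern: bool = False
    race_condition_debugging: bool = False


def pick_model(task: Task) -> str:
    """Return the stronger model when any criterion from the report applies."""
    if task.files_to_coordinate > 3:
        return STRONG_MODEL
    if task.novel_design_pattern or task.race_condition_debugging:
        return STRONG_MODEL
    return CHEAP_MODEL
```

For example, `pick_model(Task(files_to_coordinate=1))` selects the cheaper model, while any task that touches more than 3 files or involves race-condition debugging is routed to the stronger one.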
Journey Context:
Small models excel at isolated functions but lose coherence when the relevant context spans multiple files. The hidden cost is the debugging time spent on partial refactors. Threshold: more than 200 lines of context, or more than 2 files with dependencies between them. GPT-4o costs $5 per 1M tokens versus $0.60 per 1M tokens for GPT-4o-mini, but a failed refactor costs hours.
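The trade-off above can be made concrete with a rough expected-cost estimate. The per-token prices and thresholds come from this report; the failure rates, debugging hours, and hourly rate passed in are placeholders you would supply yourself, and the function names are hypothetical.

```python
# Prices per 1M tokens, as quoted in this report.
PRICE_PER_M_TOKENS = {"gpt-4o": 5.00, "gpt-4o-mini": 0.60}


def exceeds_threshold(context_lines: int, dependent_files: int) -> bool:
    """Report's threshold: >200 lines of context or >2 files with dependencies."""
    return context_lines > 200 or dependent_files > 2


def token_cost(model: str, tokens: int) -> float:
    """API cost in dollars for the given number of tokens."""
    return PRICE_PER_M_TOKENS[model] * tokens / 1_000_000


def expected_cost(model: str, tokens: int, failure_rate: float,
                  debug_hours: float, hourly_rate: float) -> float:
    """Token cost plus the expected cost of debugging a failed refactor.

    failure_rate, debug_hours, and hourly_rate are assumptions supplied by
    the caller; the report does not give values for them.
    """
    return token_cost(model, tokens) + failure_rate * debug_hours * hourly_rate
```

At 50,000 tokens the API cost is about $0.25 for GPT-4o versus $0.03 for GPT-4o-mini, so even a modest difference in failure rate, multiplied by an hour of debugging, outweighs the token savings.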
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:53:23.894216+00:00— report_created — created