Report #36317
[cost_intel] When refactoring large files (>200 lines), should I use reasoning models for full-file regeneration or chain smaller edits?
For localized changes (<50 lines affected), use instruct models with retrieval-augmented context. For architectural rewrites that require global consistency, use reasoning models, but pair them with 'speculative decoding' or 'plan-then-edit' patterns to cut token costs by roughly 60%.
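The routing rule above can be sketched as a small decision function. This is an illustrative sketch only: `RefactorRequest`, `choose_strategy`, and the strategy labels are hypothetical names, and the handling of changes of 50 lines or more that are not architectural is an assumption, since the report does not cover that case.

```python
from dataclasses import dataclass


@dataclass
class RefactorRequest:
    """Hypothetical description of a refactoring task."""
    lines_affected: int
    needs_global_consistency: bool


def choose_strategy(req: RefactorRequest) -> str:
    """Pick a model strategy using the thresholds stated in the report."""
    if req.needs_global_consistency:
        # Architectural rewrite: reasoning model, but plan first, then edit.
        return "reasoning-plan-then-edit"
    if req.lines_affected < 50:
        # Localized change: cheap instruct model with retrieved context.
        return "instruct-with-retrieval"
    # Assumption: the report does not say what to do here, so this sketch
    # falls back to the more conservative option.
    return "reasoning-plan-then-edit"
```

For example, `choose_strategy(RefactorRequest(lines_affected=12, needs_global_consistency=False))` selects the instruct-model path.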
Journey Context:
The 'full rewrite trap': naive use of reasoning models for refactoring sends entire files (2k+ tokens) to the model for regeneration, burning through context windows and budgets. The insight: reasoning models excel at 'planning' (understanding cross-dependencies and generating edit scripts) but are overkill for 'execution' (applying the textual changes). The pattern is an 'Architect-Executor' split. Use a reasoning model (o1/o3) to analyze dependencies and generate a structured edit plan (JSON patch, diff hunks, or step-by-step instructions), then use a cheap instruct model (GPT-4o-mini or Claude Haiku) to apply those edits to the text. This can cut costs by 60-80% because reasoning tokens are expensive but the plan is short, and execution is cheap. Exception: if the refactor requires semantic understanding at every edit point (for example, changing a type signature that ripples through 20 call sites with different argument patterns), the reasoning model must handle the full context.
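As a rough illustration of the 'Architect-Executor' split, the sketch below shows what a short structured edit plan might look like and how it could be applied. Everything here is an assumption made for illustration: the plan schema (`edits`, `start`, `end`, `replacement`) and the function name `apply_edit_plan` are hypothetical, not a standard format. A deterministic function also stands in for the executor; in the pattern described above, a cheap instruct model would play that role, and no model API is called.

```python
import json


def apply_edit_plan(source: str, plan_json: str) -> str:
    """Apply line-range edits (1-indexed, inclusive) from a JSON plan.

    Hypothetical plan schema:
    {"edits": [{"start": int, "end": int, "replacement": str}, ...]}
    Note: a trailing newline in `source` is not preserved.
    """
    edits = json.loads(plan_json)["edits"]
    lines = source.splitlines()
    # Apply bottom-up so earlier line numbers stay valid after each edit.
    ordered = sorted(edits, key=lambda e: e["start"], reverse=True)
    prev_start = len(lines) + 1
    for edit in ordered:
        start, end = edit["start"], edit["end"]
        # Reject out-of-range or overlapping hunks instead of guessing.
        if not (1 <= start <= end < prev_start):
            raise ValueError(f"invalid or overlapping edit: {edit}")
        lines[start - 1:end] = edit["replacement"].splitlines()
        prev_start = start
    return "\n".join(lines)


# Stand-in for the file being refactored.
SOURCE = "def area(w, h):\n    return w * h\n\nprint(area(2, 3))"

# Stand-in for the architect's output: a short plan, not a regenerated file.
PLAN = json.dumps({
    "edits": [
        {"start": 1, "end": 2,
         "replacement": "def area(width, height):\n    return width * height"},
        {"start": 4, "end": 4,
         "replacement": "print(area(width=2, height=3))"},
    ]
})

RESULT = apply_edit_plan(SOURCE, PLAN)
```

The point of the design is that the expensive model only emits the plan, which stays small relative to the file, while the application step is mechanical enough to validate (range and overlap checks) before anything is written back.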
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T15:26:17.565636+00:00— report_created — created