Report #82272
[synthesis] How to implement fast AI code edits without regenerating entire files
Decouple the planning model from the editing model. Use a frontier model to determine *what* to change, and a specialized, fast model (or structured diff parser) to apply the edit locally, rather than regenerating the whole file.
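A minimal sketch of the "structured diff parser" variant of this split, assuming the planner returns search/replace pairs. The `SearchReplaceEdit` type and the `apply_edit` / `apply_edits` helpers are hypothetical names chosen for illustration, not any product's actual format. The planner call itself is left out; only the local apply step is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchReplaceEdit:
    """Hypothetical planner output: swap one exact snippet for another."""
    search: str
    replace: str


def apply_edit(source: str, edit: SearchReplaceEdit) -> str:
    """Apply a single edit locally, refusing to guess when the match is unclear."""
    if not edit.search:
        raise ValueError("search text must not be empty")
    count = source.count(edit.search)
    if count == 0:
        raise ValueError("search text not found in source")
    if count > 1:
        raise ValueError(f"search text is ambiguous ({count} matches)")
    # Only the matched span is rewritten; the rest of the file is untouched.
    return source.replace(edit.search, edit.replace, 1)


def apply_edits(source: str, edits: list[SearchReplaceEdit]) -> str:
    """Apply edits in order; each edit sees the result of the previous one."""
    for edit in edits:
        source = apply_edit(source, edit)
    return source
```

Because the apply step is plain string work, its cost does not depend on model latency. Raising on zero or multiple matches gives the caller a clear signal to re-prompt the planner or fall back to a slower path.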
Journey Context:
Developers often try to get GPT-4 to output either full files or standard unified diffs. Unified diffs frequently fail to apply because the hunk context or whitespace does not match the file exactly, and full-file regeneration is slow and error-prone. Cursor's architecture (inferred from job postings for 'Code Edit Model' engineers and from observable latency) appears to separate the 'planner' from the 'applier'. The planner outputs a coarse edit, and a custom fast model executes the precise text insertion or replacement, resulting in sub-second edit latencies.
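To illustrate the whitespace problem, here is a hedged sketch of a deterministic applier that matches lines with leading and trailing whitespace ignored, then re-indents the replacement to fit the file. The function name `apply_whitespace_tolerant` is hypothetical, and this stands in for the applier step only; it is not a description of Cursor's custom model.

```python
import textwrap


def apply_whitespace_tolerant(source: str, search: str, replace: str) -> str:
    """Replace the unique block whose lines equal `search` up to surrounding whitespace."""
    src_lines = source.splitlines()
    want = [line.strip() for line in search.strip().splitlines()]
    if not want:
        raise ValueError("search text must not be empty")
    n = len(want)

    # Slide a window over the file and compare whitespace-stripped lines.
    hits = [
        i
        for i in range(len(src_lines) - n + 1)
        if [line.strip() for line in src_lines[i : i + n]] == want
    ]
    if len(hits) != 1:
        raise ValueError(f"expected exactly one match, found {len(hits)}")
    start = hits[0]

    # Reuse the indentation of the first matched line for the replacement.
    first = src_lines[start]
    indent = first[: len(first) - len(first.lstrip())]
    body = textwrap.dedent(replace).strip("\n").splitlines()
    new_lines = [indent + line if line.strip() else "" for line in body]

    result = src_lines[:start] + new_lines + src_lines[start + n :]
    trailing = "\n" if source.endswith("\n") else ""
    return "\n".join(result) + trailing
```

A planner that gets the indentation wrong would defeat an exact-match applier, while this version still lands the edit. The trade-off is that stripped comparison can match more than one location in repetitive code, which is why the sketch refuses to proceed unless the match is unique.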
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:41:14.713173+00:00— report_created — created