Report #36882
[synthesis] How do AI coding agents like Cursor apply edits to large files without regenerating the entire file and causing latency or hallucinated deletions?
Use a multi-stage pipeline: 1) Retrieve relevant context, 2) Generate an intent/spec, 3) Generate a search/replace diff or block-swap patch, 4) Deterministically apply the patch to the original file. Never ask the LLM to output the whole file.
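Step 4 can be sketched in a few lines of Python. This is a minimal illustration, not Cursor's actual implementation: the `Edit` dataclass, `PatchError`, and `apply_edits` are hypothetical names, and the rule that each search block must match exactly once is an assumption chosen so the applier never guesses.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    search: str   # exact text expected to occur once in the original file
    replace: str  # text that takes its place


class PatchError(Exception):
    """Raised when an edit cannot be applied unambiguously."""


def apply_edits(original: str, edits: list[Edit]) -> str:
    """Apply each edit in order; refuse to guess on missing or ambiguous matches."""
    result = original
    for i, edit in enumerate(edits):
        count = result.count(edit.search)
        if count == 0:
            # The LLM quoted text that is not in the file: reject, do not improvise.
            raise PatchError(f"edit {i}: search block not found")
        if count > 1:
            # Ambiguous anchor: the caller should ask for more surrounding context.
            raise PatchError(f"edit {i}: search block matches {count} times")
        result = result.replace(edit.search, edit.replace, 1)
    return result
```

Because everything outside the matched block is copied from the original string, unrelated code cannot be silently deleted; a bad edit fails loudly and can be retried with a more specific search block.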
Journey Context:
Early AI editors asked the LLM to output the entire modified file, which caused latency, token bloat, and hallucinated deletions of unrelated code. People tried line-number replacements, but LLMs struggle with exact line counting. Combining Cursor's observable behavior (fast, partial updates) with its indexing architecture points to the winning pattern: the LLM generates a structured diff (e.g., specific search blocks and replacement blocks), and a deterministic parser applies it. This separates semantic reasoning (the LLM) from syntactic application (code).
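The parsing half of that split might look like the sketch below. The marker strings (`<<<<<<< SEARCH`, `=======`, `>>>>>>> REPLACE`) are an assumed convention for illustration only; the report does not state what format Cursor uses, and `parse_blocks` is a hypothetical helper.

```python
# Assumed block markers; any unambiguous delimiters would work the same way.
SEARCH_MARK = "<<<<<<< SEARCH"
DIVIDER = "======="
REPLACE_MARK = ">>>>>>> REPLACE"


def parse_blocks(llm_output: str) -> list[tuple[str, str]]:
    """Extract (search, replace) pairs, ignoring any prose around the blocks."""
    blocks = []
    search_lines, replace_lines = [], []
    state = "outside"
    for line in llm_output.splitlines():
        if state == "outside":
            if line == SEARCH_MARK:
                search_lines, replace_lines = [], []
                state = "search"
        elif state == "search":
            if line == DIVIDER:
                state = "replace"
            else:
                search_lines.append(line)
        else:  # state == "replace"
            if line == REPLACE_MARK:
                blocks.append(("\n".join(search_lines), "\n".join(replace_lines)))
                state = "outside"
            else:
                replace_lines.append(line)
    if state != "outside":
        # A truncated response should fail rather than yield a partial edit.
        raise ValueError("unterminated search/replace block")
    return blocks
```

Matching on quoted text rather than line numbers sidesteps the line-counting weakness: the model only has to reproduce a snippet it has already seen, and ordinary code locates it.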
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:22:45.599411+00:00— report_created — created