Report #99455
[synthesis] How do you keep a coding agent loop fast and accurate enough for interactive use?
Train a specialized model on edit trajectories (original_code, edit_command, final_code) and deliberately over-train it on search/replace tool use, then pair it with speculative decoding, context compaction, and sandboxed test execution.
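To make the trajectory format concrete, here is a minimal sketch of a search/replace edit tool and the (original_code, edit_command, final_code) record it could produce. The class and function names (`EditCommand`, `EditTrajectory`, `apply_search_replace`, `make_trajectory`) are hypothetical and chosen for illustration; the exactly-one-match rule is an assumption about what makes such a tool safe, not a documented detail of any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditCommand:
    """Hypothetical search/replace tool call (field names are illustrative)."""
    search: str
    replace: str


@dataclass(frozen=True)
class EditTrajectory:
    """One training record: (original_code, edit_command, final_code)."""
    original_code: str
    edit_command: EditCommand
    final_code: str


def apply_search_replace(code: str, cmd: EditCommand) -> str:
    """Apply a surgical edit; refuse ambiguous or missing matches.

    Assumption: the search text must occur exactly once, so the edit
    cannot silently land in the wrong place.
    """
    if not cmd.search:
        raise ValueError("search text must be non-empty")
    matches = code.count(cmd.search)
    if matches != 1:
        raise ValueError(f"expected exactly 1 match, found {matches}")
    return code.replace(cmd.search, cmd.replace, 1)


def make_trajectory(original_code: str, cmd: EditCommand) -> EditTrajectory:
    """Build one trajectory by actually applying the edit command."""
    final_code = apply_search_replace(original_code, cmd)
    return EditTrajectory(original_code, cmd, final_code)


if __name__ == "__main__":
    src = "def add(a, b):\n    return a - b\n"
    traj = make_trajectory(src, EditCommand(search="a - b", replace="a + b"))
    print(traj.final_code)
```

The strict matching rule hints at why search/replace is harder to learn than whole-file rewriting: the model has to reproduce the search text verbatim and make it unique within the file.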
Journey Context:
The real bottleneck is not generation but the "diff problem": models rewrite entire files instead of making surgical edits, and latency compounds across every plan/search/edit/test iteration. Cursor's Composer architecture routes each request to a model, runs a ReAct-style tool loop, and gets its speed from a vertically integrated inference stack: MoE weights, speculative edits that reuse the user's own code as draft tokens, and aggressive context compaction. They trained search/replace tool use on dedicated trajectories because those tools are harder to learn than others. The takeaway is that agent speed comes from custom training, inference optimizations, and context management together, not just from a faster frontier model.
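The idea of reusing the user's own code as draft tokens can be sketched as follows. This is a toy illustration under stated assumptions, not Cursor's implementation: `speculative_edit`, `oracle_model`, the whitespace tokenization, and the `chunk`/`window` parameters are all hypothetical. A real inference stack would verify a whole draft chunk in one batched forward pass; here a plain Python function stands in for the model, and each loop iteration counts as one simulated pass.

```python
from typing import Callable, List, Tuple

Token = str
# Stand-in for a greedy model step: given the output so far, return the next token.
NextToken = Callable[[List[Token]], Token]


def oracle_model(target: List[Token], eos: Token = "<eos>") -> NextToken:
    """Toy 'model' that already knows the edited file (illustration only)."""
    def next_token(prefix: List[Token]) -> Token:
        return target[len(prefix)] if len(prefix) < len(target) else eos
    return next_token


def speculative_edit(
    original: List[Token],
    next_token: NextToken,
    chunk: int = 8,
    window: int = 16,
    eos: Token = "<eos>",
    max_tokens: int = 10_000,
) -> Tuple[List[Token], int]:
    """Decode an edited file using the original file as draft tokens.

    Returns the output tokens and the number of simulated verification passes.
    Every emitted token is either verified against or produced by the model,
    so the output matches plain greedy decoding.
    """
    out: List[Token] = []
    cursor = 0   # position in `original` where the next draft starts
    passes = 0
    while len(out) < max_tokens:
        passes += 1
        draft = original[cursor:cursor + chunk]
        accepted = 0
        for tok in draft:
            # Accept draft tokens for as long as the model agrees with them.
            if next_token(out + draft[:accepted]) != tok:
                break
            accepted += 1
        out.extend(draft[:accepted])
        cursor += accepted
        # The model supplies the token at the first disagreement
        # (or the token that follows a fully accepted draft).
        tok = next_token(out)
        if tok == eos:
            break
        out.append(tok)
        # Heuristic re-alignment: if the model's token appears shortly ahead
        # in the original, resume drafting right after it.
        lookahead = original[cursor:cursor + window]
        if tok in lookahead:
            cursor += lookahead.index(tok) + 1
    return out, passes


if __name__ == "__main__":
    original = "def add ( a , b ) : return a - b".split()
    target = "def add ( a , b ) : return a + b".split()
    edited, passes = speculative_edit(original, oracle_model(target))
    print(" ".join(edited), "| passes:", passes, "| tokens:", len(edited))
```

When the edit is small, most draft chunks are accepted whole, so the number of passes stays well below the number of output tokens. The re-alignment heuristic only affects speed: a bad guess wastes a pass but cannot change the output.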
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T05:10:16.767628+00:00— report_created — created