Report #88225
[synthesis] AI coding agents stream raw text, causing janky UI diffs, broken syntax, and high latency
Implement a multi-layer streaming architecture: stream structured diff operations (not raw text) from a fast model for immediate UI application, while a slower model validates or completes the broader context in the background.
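A minimal sketch of what "structured diff operations" could look like on the client side. The `DiffOp` schema, its field names, and the `apply_stream` helper are hypothetical illustrations, not the wire format of Cursor or any other tool. The point is only that each streamed operation is a complete, line-level edit that can be applied and repainted on its own, with no client-side text diffing.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class DiffOp:
    """One structured edit (hypothetical schema, assumed for illustration)."""
    kind: str        # "insert", "delete", or "replace"
    line: int        # 0-based index into the current buffer
    text: str = ""   # new line content; ignored for "delete"


def apply_op(buffer: List[str], op: DiffOp) -> None:
    """Apply a single op in place; reject ops that do not fit the buffer."""
    if op.kind == "insert":
        if not 0 <= op.line <= len(buffer):
            raise IndexError(f"insert out of range: {op.line}")
        buffer.insert(op.line, op.text)
    elif op.kind in ("delete", "replace"):
        if not 0 <= op.line < len(buffer):
            raise IndexError(f"{op.kind} out of range: {op.line}")
        if op.kind == "delete":
            del buffer[op.line]
        else:
            buffer[op.line] = op.text
    else:
        raise ValueError(f"unknown op kind: {op.kind!r}")


def apply_stream(buffer: List[str], ops: Iterable[DiffOp]) -> Iterator[List[str]]:
    """Apply ops as they arrive, yielding a snapshot after each one.

    Each snapshot stands in for one UI repaint.
    """
    for op in ops:
        apply_op(buffer, op)
        yield list(buffer)


if __name__ == "__main__":
    buf = ["def f():", "    return 1"]
    incoming = [DiffOp("replace", 1, "    return 2")]
    for frame in apply_stream(buf, incoming):
        print(frame)
```

Because every op is validated before it touches the buffer, a malformed or out-of-range op fails loudly instead of silently corrupting the file, which is the failure mode attributed to raw text streaming here.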
Journey Context:
Streaming raw text forces the client to diff the output itself (computing character-by-character changes), which is expensive, unstable under LLM non-determinism, and prone to corrupting file syntax. Cursor's observable API behavior suggests that it streams fast, predictive edits (often from a smaller, fine-tuned model for local completion) while orchestrating larger, structural edits via explicit diff blocks. The synthesis: separating the 'UI prediction' stream from the 'semantic code generation' stream allows sub-100ms latency without sacrificing complex reasoning, because the client applies incremental state updates rather than full-file rewrites.
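The two-stream split can be sketched with `asyncio`. Everything here is an assumption made for illustration: `fast_predictions` and `slow_validation` are stand-ins for the small and large models, the `(kind, line, text)` op shape is hypothetical, and the validator's corrections are applied naively after the fast stream finishes. A real system would also need to rebase the validator's line numbers if fast ops had inserted or deleted lines in the meantime. This sketch avoids that by having the fast stream emit only `replace` ops.

```python
import asyncio
from typing import AsyncIterator, List, Tuple

Op = Tuple[str, int, str]  # (kind, line, text); hypothetical op shape


async def fast_predictions() -> AsyncIterator[Op]:
    """Stand-in for the small, low-latency model: emits local edits quickly."""
    for op in [("replace", 1, "    return a + b")]:
        await asyncio.sleep(0)  # yield control, as a network read would
        yield op


async def slow_validation(snapshot: List[str]) -> List[Op]:
    """Stand-in for the larger model: inspects context, returns corrective ops."""
    await asyncio.sleep(0.01)  # simulated extra latency
    if not snapshot[0].startswith("def add(a, b)"):
        return [("replace", 0, "def add(a, b):")]
    return []


def apply(buffer: List[str], op: Op) -> None:
    """Apply one line-level op in place."""
    kind, line, text = op
    if kind == "replace":
        buffer[line] = text
    elif kind == "insert":
        buffer.insert(line, text)
    elif kind == "delete":
        del buffer[line]
    else:
        raise ValueError(f"unknown op kind: {kind!r}")


async def run_session(buffer: List[str]) -> List[str]:
    # Start the slow model in the background on a snapshot of the buffer.
    validator = asyncio.create_task(slow_validation(list(buffer)))
    # Meanwhile, apply fast predictions as they arrive (the UI stream).
    async for op in fast_predictions():
        apply(buffer, op)
    # Then apply whatever structural corrections the slow model produced.
    for op in await validator:
        apply(buffer, op)
    return buffer


if __name__ == "__main__":
    print(asyncio.run(run_session(["def add(a):", "    return a"])))
```

The design choice illustrated is that the UI never waits on the slow model: its output arrives later as ordinary ops against the same buffer, not as a full-file rewrite.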
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T06:40:12.673711+00:00— report_created — created