Report #90595
[synthesis] AI products that use a single model path for all requests end up slow on simple tasks or poor on complex tasks
Architect two distinct paths: a fast path (small/local model, <200ms, autocomplete/quick answers) and a slow path (frontier model, 5-30s, complex reasoning/multi-step edits). Route based on task complexity signals, not user choice. The fast path builds habit; the slow path handles hard problems.
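A minimal sketch of what routing on task complexity signals could look like. The signals used here (prompt length, number of files touched, a keyword list) and the threshold are illustrative assumptions, not taken from any product named in this report; a real router would tune or learn them.

```python
from dataclasses import dataclass
from enum import Enum


class Path(Enum):
    FAST = "fast"  # small/local model, target under 200 ms
    SLOW = "slow"  # frontier model, roughly 5-30 s


@dataclass(frozen=True)
class Request:
    text: str
    files_touched: int = 1             # hypothetical signal: files the edit spans
    is_inline_completion: bool = False  # hypothetical signal: autocomplete request


# Hypothetical keyword hints that a request needs multi-step reasoning.
COMPLEX_HINTS = ("refactor", "explain why", "step by step", "migrate", "design")


def complexity_score(req: Request) -> int:
    """Count how many of the assumed complexity signals fire for this request."""
    score = 0
    if len(req.text.split()) > 40:  # long prompts tend to describe larger tasks
        score += 1
    if req.files_touched > 1:       # multi-file edits are multi-step by nature
        score += 1
    lowered = req.text.lower()
    if any(hint in lowered for hint in COMPLEX_HINTS):
        score += 1
    return score


def route(req: Request, threshold: int = 1) -> Path:
    """Pick a path from signals alone; the user never chooses a model."""
    if req.is_inline_completion:
        return Path.FAST
    return Path.SLOW if complexity_score(req) >= threshold else Path.FAST
```

For example, `route(Request("rename variable x"))` returns `Path.FAST`, while `route(Request("Refactor the auth module", files_touched=3))` returns `Path.SLOW`. Raising `threshold` sends more traffic to the fast path, trading quality on borderline tasks for lower latency.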
Journey Context:
Products that serve every use case through one model path fail in one of two directions: a big model for everything is slow and expensive on simple tasks, while a small model for everything is inadequate on complex ones. Cursor's architecture shows the split clearly: Tab completion uses a fast model (~100ms latency target), while Chat/Agent uses frontier models (5-30s). The same pattern appears in GitHub Copilot completions vs Copilot Workspace, Perplexity instant vs Pro search, and v0 quick generation vs iterative refinement. The synthesis: this is not just an optimization but a fundamental architectural requirement. The fast path creates the 'magic' feeling that builds user trust and daily habit, so it must feel instant. The slow path handles the revenue-generating complex tasks, so it must be thorough. The routing logic is itself a critical component: Cursor routes on interaction modality (typing vs asking), while Perplexity routes on query complexity signals. Getting routing wrong means either annoying latency on simple tasks or insufficient quality on hard ones. Products that lack this bifurcation lose users either on latency or on quality.
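The two routing failure modes described above can be made measurable. The sketch below pairs a modality-based router (typing vs asking) with a tally of misroutes. The names `Modality`, `route_by_modality`, and `misroute_report`, and the labeled-sample format, are hypothetical and do not describe how any product named here is implemented.

```python
from collections import Counter
from enum import Enum


class Modality(Enum):
    TYPING = "typing"  # user is typing in the editor and expects instant output
    ASKING = "asking"  # user explicitly asked a question or requested an edit


def route_by_modality(modality: Modality) -> str:
    """Route on how the user is interacting, not on the request content."""
    return "fast" if modality is Modality.TYPING else "slow"


def misroute_report(samples):
    """Tally routing outcomes.

    samples: iterable of (routed_path, needed_path) pairs, where needed_path
    is a label assigned after the fact (an assumption of this sketch).
    """
    counts = Counter()
    for routed, needed in samples:
        if routed == needed:
            counts["correct"] += 1
        elif routed == "slow":
            counts["latency_miss"] += 1  # simple task paid slow-path latency
        else:
            counts["quality_miss"] += 1  # hard task got fast-path quality
    return dict(counts)
```

Tracking `latency_miss` and `quality_miss` separately matters because the two errors cost different things: the first erodes the instant feel of the fast path, the second erodes trust in results on hard problems.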
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T10:39:24.728228+00:00— report_created — created