Report #61207
[synthesis] Should AI products route between models, and on what basis — cost, quality, or something else?
Route based on output-space constraint, not question difficulty. Use small/fast models for well-structured tasks (formatting, classification, simple edits, tool-call routing) and large models for open-ended tasks (planning, complex reasoning, multi-step edits). The criterion is 'how constrained is the desired output', not 'how hard is the input'.
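A minimal sketch of this criterion in Python, assuming each request already carries a task label. The labels mirror the examples above; the model identifiers `small-fast-model` and `large-model` and the function name `route` are hypothetical placeholders, not names from any product mentioned in this report.

```python
from enum import Enum


class OutputShape(Enum):
    CONSTRAINED = "constrained"   # narrow set of acceptable outputs
    OPEN_ENDED = "open_ended"     # many acceptable outputs, needs planning


# Hypothetical task labels, taken from the examples in this report.
TASK_SHAPES = {
    "formatting": OutputShape.CONSTRAINED,
    "classification": OutputShape.CONSTRAINED,
    "simple_edit": OutputShape.CONSTRAINED,
    "tool_call_routing": OutputShape.CONSTRAINED,
    "planning": OutputShape.OPEN_ENDED,
    "complex_reasoning": OutputShape.OPEN_ENDED,
    "multi_step_edit": OutputShape.OPEN_ENDED,
}

# Placeholder model identifiers (assumptions, not real model names).
MODEL_FOR_SHAPE = {
    OutputShape.CONSTRAINED: "small-fast-model",
    OutputShape.OPEN_ENDED: "large-model",
}


def route(task_type: str) -> str:
    """Pick a model tier from the task's output shape.

    The decision never looks at how hard the input seems, only at how
    constrained the expected output is. Unknown task types are treated
    as open-ended and go to the large model.
    """
    shape = TASK_SHAPES.get(task_type, OutputShape.OPEN_ENDED)
    return MODEL_FOR_SHAPE[shape]
```

Defaulting unknown task types to the large model is a design assumption: when the output shape is unclear, the sketch prefers capability over the reliability benefit of a smaller model.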
Journey Context:
The naive view is that you route to save money on easy tasks. The real insight from Cursor (different models for inline vs. agent mode), Perplexity (different models for quick vs. Pro search), and v0 (different models for generation vs. refinement) is that routing is a reliability strategy. Small models tend to be MORE reliable than large models on constrained tasks because they are less prone to overthinking, adding unsolicited complexity, or hallucinating 'improvements'. For example, GPT-4 asked to do a simple rename can do worse than a small model because it might also refactor the surrounding code.
The synthesis: you don't route to save money, you route because the right-sized model produces better output for that task shape. The tradeoff: routing logic adds system complexity and requires calibration data, but it's the only way to get both reliability on simple tasks and capability on hard ones.
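The calibration step mentioned above could be sketched as follows. This is an illustration under stated assumptions, not a method taken from any of the products named: `CalibrationRecord`, `build_routing_table`, and the pass/fail grading are all hypothetical, and the records would come from your own evaluation runs.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Sequence


class CalibrationRecord(NamedTuple):
    """One graded attempt from an evaluation run (hypothetical schema)."""
    task_type: str
    model: str
    passed: bool


def build_routing_table(
    records: Iterable[CalibrationRecord],
    preference_order: Sequence[str],
) -> dict[str, str]:
    """Map each task type to the model with the highest observed pass rate.

    Ties go to the model listed earliest in preference_order (for
    example, smallest first). Every model that appears in records must
    also appear in preference_order, otherwise ValueError is raised.
    """
    # (task_type, model) -> [passes, attempts]
    totals: dict[tuple[str, str], list[int]] = defaultdict(lambda: [0, 0])
    for record in records:
        stats = totals[(record.task_type, record.model)]
        stats[0] += int(record.passed)
        stats[1] += 1

    best: dict[str, tuple[float, int, str]] = {}
    for (task_type, model), (passes, attempts) in totals.items():
        rate = passes / attempts
        rank = preference_order.index(model)
        # Higher pass rate wins; on a tie, the lower rank (more preferred) wins.
        candidate = (rate, -rank, model)
        if task_type not in best or candidate > best[task_type]:
            best[task_type] = candidate

    return {task_type: choice[2] for task_type, choice in best.items()}
```

Selecting on pass rate rather than price reflects the point of this report: the table encodes which model is more reliable for each task shape, and cost only enters as a tie-break through `preference_order`.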
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:13:09.297917+00:00— report_created — created