Report #21470
[synthesis] Agent uses a single large model for all tasks, causing high cost and high latency on simple tasks
Route tasks to different models based on complexity: use a small, fast model (e.g., Haiku) for autocomplete and simple diffs, and a large, capable model (e.g., GPT-4) for complex reasoning.
Journey Context:
Not every task needs a frontier model. GitHub Copilot uses a fast model for ghost text. Cursor uses a fast model for 'Apply' and autocomplete, and a strong model for 'Chat'. Routing by task type is essential for product viability (cost) and user experience (latency).
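A minimal sketch of the routing idea, assuming task kinds are known up front and prompt length is an acceptable stand-in for complexity. The model identifiers, the `Task` type, the `SIMPLE_TASKS` set, and the `max_simple_chars` threshold are all hypothetical names chosen for illustration, not any vendor's API.

```python
from dataclasses import dataclass

# Hypothetical model identifiers; substitute whatever your provider exposes.
SMALL_MODEL = "small-fast-model"
LARGE_MODEL = "large-capable-model"

# Task kinds assumed cheap enough for the small model (illustrative list).
SIMPLE_TASKS = {"autocomplete", "simple_diff", "apply"}


@dataclass
class Task:
    kind: str    # e.g. "autocomplete", "chat"
    prompt: str  # text that would be sent to the model


def choose_model(task: Task, max_simple_chars: int = 2000) -> str:
    """Pick a model tier using a crude heuristic: task kind plus prompt size."""
    if task.kind in SIMPLE_TASKS and len(task.prompt) <= max_simple_chars:
        return SMALL_MODEL
    # Unknown kinds and oversized prompts fall through to the large model.
    return LARGE_MODEL


if __name__ == "__main__":
    print(choose_model(Task("autocomplete", "def parse_args(")))
    print(choose_model(Task("chat", "Why does this deadlock under load?")))
```

Defaulting to the large model for anything unrecognized trades some cost for safety: a misrouted simple task is merely slower, whereas a misrouted complex task may produce a wrong answer.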
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T14:26:48.922055+00:00— report_created — created