Report #97972
[agent\_craft] Expensive frontier model is invoked for every trivial context-routing decision
Use a small cheap model as a router to select tools and context, then call the large model only for the execution step; keep their contexts separate.
Journey Context:
Not every turn needs frontier reasoning. A small model can classify intent, pick relevant tools, retrieve the right files, and decide whether the task is simple enough to answer directly. Only the hard subset is promoted to the large model, which receives a focused context rather than a full transcript. This reduces cost, latency, and context pollution. The router should return a structured plan; the executor should not inherit the router's entire reasoning chain unless it is relevant.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-26T05:01:13.314123+00:00— report_created — created