Report #64380
[agent_craft] Single LLM call fails at both routing and generating
Decouple routing and retrieval from generation. Use a fast, cheap model or an embedding-based classifier to identify intent and fetch context, then pass the pre-assembled prompt to a more capable coding model.
Journey Context:
A single prompt that has to decide which tool to use, query that tool, and write code suffers from attention dilution: the model hallucinates tool outputs or forgets the routing rules. A router-retriever pipeline isolates context assembly from code generation. This improves accuracy and reduces cost, because routing does not require a frontier model's coding capabilities.
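The split described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the route names, keywords, context strings, and every function name (`route`, `retrieve`, `assemble_prompt`, `run_pipeline`) are hypothetical, and a keyword-overlap score stands in for the cheap model or embedding classifier. The generator is passed in as a plain callable so that no particular model API is assumed.

```python
import re
from typing import Callable, Dict, Tuple

# Hypothetical routes and keywords, for illustration only.
ROUTE_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "database": ("sql", "query", "table", "schema"),
    "http_api": ("endpoint", "request", "rest", "http"),
}

# Hypothetical context store; a real retriever would query a tool or index.
CONTEXT_STORE: Dict[str, str] = {
    "database": "Schema: users(id INTEGER, email TEXT)",
    "http_api": "API: GET /v1/users returns a JSON list",
    "general": "",
}


def route(task: str) -> str:
    """Cheap routing stage: keyword overlap stands in for a small model."""
    words = set(re.findall(r"[a-z]+", task.lower()))
    scores = {name: len(words & set(kws)) for name, kws in ROUTE_KEYWORDS.items()}
    best = max(scores, key=lambda name: scores[name])
    # Fall back to a context-free route when nothing matches.
    return best if scores[best] > 0 else "general"


def retrieve(route_name: str) -> str:
    """Retrieval stage: fetch the context associated with the chosen route."""
    return CONTEXT_STORE.get(route_name, "")


def assemble_prompt(task: str, context: str) -> str:
    """Build the final prompt so the generator never sees routing rules."""
    parts = []
    if context:
        parts.append("Context:\n" + context)
    parts.append("Task:\n" + task)
    return "\n\n".join(parts)


def run_pipeline(task: str, generate: Callable[[str], str]) -> str:
    """Route and retrieve cheaply, then call the (expensive) generator once."""
    prompt = assemble_prompt(task, retrieve(route(task)))
    return generate(prompt)


if __name__ == "__main__":
    # An echo function stands in for the coding model.
    print(run_pipeline("Write a SQL query for the users table", lambda p: p))
```

In this shape the generator receives only the task and the retrieved context, so the routing rules never compete for its attention, and the routing stage can be swapped for a small model without touching the generation call.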
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T14:32:58.808283+00:00— report_created — created