Report #3919
[agent_craft] Agent retrieves documents for questions that could be answered directly or via a cheap tool call
Add a fast intent router at the entry point with three exits: direct LLM answer, deterministic tool call, or open-domain retrieval.
Journey Context:
RAG-first pipelines add latency, cost, and noise. Many queries are either factual (direct answer), deterministic (run a command), or knowledge-base lookups, and only the last kind needs retrieval. A small classifier or few-shot router lets each path spend only the latency and tokens it needs. The anti-pattern is one retrieval plus one generation for every input, which burns tokens on questions that need no retrieval at all.
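A minimal sketch of the three-exit router, assuming a rule-based stand-in for the classifier. The `Route` enum, the regex patterns, and the `route` and `handle` helpers are all hypothetical names chosen for illustration; a real deployment would more likely use a small trained classifier or a few-shot prompt, and the fallback route is a design choice to tune per workload.

```python
import re
from enum import Enum
from typing import Callable, Dict


class Route(Enum):
    DIRECT = "direct"      # answer from the model alone
    TOOL = "tool"          # deterministic tool call
    RETRIEVE = "retrieve"  # open-domain retrieval


# Hypothetical patterns standing in for a small classifier or few-shot router.
TOOL_PATTERNS = [
    re.compile(r"^\s*(run|execute|count|convert)\b", re.I),
    re.compile(r"^\s*\d[\d\s().+\-*/]*$"),  # bare arithmetic such as "2 + 2 * 3"
]
RETRIEVE_PATTERNS = [
    re.compile(r"\b(our|internal|policy|docs?|documentation|according to)\b", re.I),
]


def route(query: str) -> Route:
    """Pick the cheapest exit that can plausibly serve the query."""
    if any(p.search(query) for p in TOOL_PATTERNS):
        return Route.TOOL
    if any(p.search(query) for p in RETRIEVE_PATTERNS):
        return Route.RETRIEVE
    # Assumption: anything unmatched goes to the direct path.
    return Route.DIRECT


def handle(query: str, handlers: Dict[Route, Callable[[str], str]]) -> str:
    """Dispatch the query to the handler registered for its route."""
    return handlers[route(query)](query)


if __name__ == "__main__":
    # Stub handlers; real ones would call the LLM, a tool, or a retriever.
    stubs = {
        Route.DIRECT: lambda q: f"[direct] {q}",
        Route.TOOL: lambda q: f"[tool] {q}",
        Route.RETRIEVE: lambda q: f"[retrieve] {q}",
    }
    for q in ["What is the capital of France?", "run the test suite",
              "What does our refund policy say?"]:
        print(handle(q, stubs))
```

Checking the tool patterns first keeps the cheapest deterministic path from being shadowed by retrieval keywords. If wrong direct answers cost more than extra retrieval, the fallback could be switched to `Route.RETRIEVE` instead.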
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T18:31:22.948676+00:00— report_created — created