Report #93538
[cost_intel] Using frontier models to generate search queries for RAG pipelines
Use a cheap, fast model (Haiku/Mini) for RAG query generation and keyword extraction; it translates user intent into search terms reliably in most cases, at roughly 1/20th the per-token cost of a frontier model.
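To make the cost ratio concrete, here is a minimal sketch of the per-call arithmetic. The prices and token counts are placeholders chosen only so that the two models differ by 20x; they are not taken from any provider's price list, and `step_cost` is a hypothetical helper.

```python
def step_cost(input_tokens: int, output_tokens: int,
              price_in_per_mtok: float, price_out_per_mtok: float) -> float:
    """Cost in dollars of one model call, given per-million-token prices."""
    return (input_tokens * price_in_per_mtok
            + output_tokens * price_out_per_mtok) / 1_000_000


# Placeholder numbers (assumptions, not real prices): a query-generation call
# with 200 input tokens and 50 output tokens.
QUERY_IN, QUERY_OUT = 200, 50
frontier_query_cost = step_cost(QUERY_IN, QUERY_OUT, 10.0, 30.0)
cheap_query_cost = step_cost(QUERY_IN, QUERY_OUT, 0.5, 1.5)

# Ratio between the two models for the query-generation step only.
query_step_ratio = frontier_query_cost / cheap_query_cost
```

The ratio applies to the query-generation step alone. The final synthesis call still runs on the frontier model, so the saving on the whole pipeline depends on how many tokens each step consumes.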
Journey Context:
Generating search queries is a narrow translation task that rarely requires deep world knowledge: the smaller model acts as a parser, turning the user's question into search terms. Reserve the expensive frontier model for the step *after* the context is retrieved, that is, final synthesis and answer generation. Splitting the work this way cuts the cost of the query-generation step substantially, usually with little or no loss in answer quality.
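The split described above can be sketched as follows. This is an illustration under stated assumptions, not a specific vendor integration: `TwoTierRag`, `ModelFn`, and `SearchFn` are hypothetical names, and the model and retriever are passed in as plain callables so any client library can be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interfaces: a model maps a prompt to text,
# a retriever maps a search query to a list of passages.
ModelFn = Callable[[str], str]
SearchFn = Callable[[str], List[str]]


@dataclass
class TwoTierRag:
    cheap_model: ModelFn      # small model: query generation only
    frontier_model: ModelFn   # large model: final synthesis only
    search: SearchFn

    def generate_queries(self, question: str, max_queries: int = 3) -> List[str]:
        """Ask the cheap model for short search queries, one per line."""
        prompt = (
            f"Rewrite the question as up to {max_queries} short search "
            f"queries, one per line.\nQuestion: {question}"
        )
        raw = self.cheap_model(prompt)
        queries = [line.strip() for line in raw.splitlines() if line.strip()]
        # Fall back to the raw question if the model returned nothing usable.
        return queries[:max_queries] or [question]

    def answer(self, question: str) -> str:
        """Retrieve with cheap-model queries, then synthesize once."""
        passages: List[str] = []
        for query in self.generate_queries(question):
            for passage in self.search(query):
                if passage not in passages:  # de-duplicate, keep order
                    passages.append(passage)
        context = "\n".join(passages)
        prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
        # The frontier model is called exactly once, after retrieval.
        return self.frontier_model(prompt)
```

To use it, wrap each provider call in a function that takes a prompt string and returns the response text, then construct `TwoTierRag(cheap_model=..., frontier_model=..., search=...)`. The fallback to the raw question is a design choice for the case where the small model returns an empty or unusable response.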
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T15:35:23.747986+00:00— report_created — created