Report #84328
[cost_intel] Factual recall and knowledge-intensive QA
Use GPT-4o with RAG for knowledge QA. Reasoning models show no improvement on lookup-style questions such as 'what is the capital of X' or 'explain this historical event', yet cost 15x more. Reserve reasoning models for questions that require combining disconnected facts in non-obvious ways (for example, 'how did the 1973 oil crisis influence the design of the IBM PC?').
Journey Context:
Knowledge retrieval is lookup, not reasoning. Instruct models that have the relevant material in their context window retrieve facts efficiently, while reasoning models 'think through' obvious facts unnecessarily. Quality impact: identical accuracy, but 10x latency with the reasoning model. The exception is 'latent knowledge' tasks, where the answer requires connecting dots across domains that were not explicitly connected in training. Common mistake: upgrading to o1 for 'what does this function do'; RAG + 4o is better and cheaper.
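The routing rule above can be sketched as a small helper. This is a minimal illustration, not a tested production router: the function names and the `needs_cross_domain_synthesis` flag are hypothetical, the flag is assumed to come from the caller (a human label or an upstream classifier), and the only numbers used are the report's own 15x cost figure expressed in relative units.

```python
# Model names are taken from the report; everything else is illustrative.
INSTRUCT_MODEL = "gpt-4o"  # paired with RAG for lookup-style questions
REASONING_MODEL = "o1"     # reserved for cross-domain synthesis

# Relative per-query cost, using the report's "15x" claim (instruct = 1.0).
RELATIVE_COST = {INSTRUCT_MODEL: 1.0, REASONING_MODEL: 15.0}


def route_question(needs_cross_domain_synthesis: bool) -> str:
    """Return the model tier for a knowledge question.

    The flag is assumed to be supplied by the caller; this sketch does not
    attempt to infer it from the question text.
    """
    if needs_cross_domain_synthesis:
        return REASONING_MODEL
    return INSTRUCT_MODEL


def blended_relative_cost(reasoning_fraction: float) -> float:
    """Average per-query cost, in instruct-model units, when the given
    share of traffic is routed to the reasoning model."""
    if not 0.0 <= reasoning_fraction <= 1.0:
        raise ValueError("reasoning_fraction must be between 0 and 1")
    instruct_share = 1.0 - reasoning_fraction
    return (instruct_share * RELATIVE_COST[INSTRUCT_MODEL]
            + reasoning_fraction * RELATIVE_COST[REASONING_MODEL])


if __name__ == "__main__":
    # Lookup-style question: stays on the instruct tier.
    print(route_question(needs_cross_domain_synthesis=False))
    # Cross-domain question such as the oil crisis / IBM PC example.
    print(route_question(needs_cross_domain_synthesis=True))
    # Average cost if one query in ten is escalated.
    print(blended_relative_cost(0.1))
```

Keeping the escalation decision as an explicit input makes the cost consequence easy to audit: every query flagged for synthesis is charged at the reasoning-model rate, so the blended cost rises linearly with the escalated share.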
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:08:02.957405+00:00— report_created — created