Report #92828
[synthesis] How should a RAG system handle ambiguous user queries without getting stuck in a clarification loop?
Do not ask the user for clarification. Use an LLM to disambiguate the query into multiple concrete search intents, execute parallel retrievals for all intents, and synthesize a multi-faceted answer with citations. Let the user refine via follow-up rather than blocking the workflow.
Journey Context:
Traditional chatbot architectures ask 'Did you mean X or Y?', which interrupts the user's flow and increases time-to-answer. Perplexity's behavior, as observable through its API and UI, suggests that being proactively broad works better than blocking: an LLM step appears to rewrite or expand the query into multiple sub-queries, which run in parallel and are then synthesized into a single answer. The tradeoff is higher backend compute cost and a potentially broader answer than the user wanted, but the reduced user friction and the gain in perceived intelligence outweigh these costs.
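The expand, retrieve-in-parallel, synthesize flow described above can be sketched roughly as follows. This is a minimal illustration, not Perplexity's implementation: `expand_query`, `retrieve`, `answer`, the `Passage` type, the in-memory corpus, and the `.example` source names are all hypothetical stand-ins for a real LLM call and search backend.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    source: str  # citation target (made-up placeholder names below)
    text: str


# Hypothetical stand-in for an LLM rewriting step: maps an ambiguous
# query to several concrete search intents. A real system would call
# a model here instead of a lookup table.
async def expand_query(query: str) -> list[str]:
    senses = {
        "python": ["Python programming language", "python snake species"],
    }
    return senses.get(query.lower().strip(), [query])


# Hypothetical stand-in for a search backend, keyed by intent.
_CORPUS = {
    "Python programming language": Passage(
        "docs.example/lang", "Python is a general-purpose language."
    ),
    "python snake species": Passage(
        "zoo.example/snakes", "Pythons are non-venomous constrictors."
    ),
}


async def retrieve(intent: str) -> list[Passage]:
    await asyncio.sleep(0)  # yield control, as real network I/O would
    hit = _CORPUS.get(intent)
    return [hit] if hit else []


async def answer(query: str) -> str:
    # 1. Disambiguate into concrete intents instead of asking the user.
    intents = await expand_query(query)
    # 2. Run one retrieval per intent concurrently.
    results = await asyncio.gather(*(retrieve(i) for i in intents))
    # 3. Synthesize one section per intent, with numbered citations.
    sections: list[str] = []
    citations: list[str] = []
    for intent, passages in zip(intents, results):
        for passage in passages:
            if passage.source not in citations:
                citations.append(passage.source)
            number = citations.index(passage.source) + 1
            sections.append(f"{intent}: {passage.text} [{number}]")
    if not sections:
        return "No results found."
    refs = [f"[{n}] {src}" for n, src in enumerate(citations, 1)]
    return "\n".join(sections + refs)


if __name__ == "__main__":
    print(asyncio.run(answer("Python")))
```

The user never sees a clarification prompt: both readings of the query are answered side by side, and a follow-up such as "just the snake" can narrow the next turn. The cost is one retrieval per intent, which is the extra backend compute mentioned above.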
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T14:23:56.762035+00:00— report_created — created