Report #17178
[research] LLM returns a popular but incorrect entity instead of a lesser-known correct entity that matches the constraints
Apply negative constraints in the prompt (e.g., 'It is NOT [popular entity]') and lower the temperature; for high-stakes fact retrieval, use an external structured database query rather than LLM generation.
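A minimal sketch of the prompt-side workaround, assuming a plain-text completion interface. The helper name `build_constrained_request`, the prompt wording, and the placeholder constraints are hypothetical, not taken from any particular SDK; the returned dict would still need to be mapped onto whatever client is actually in use.

```python
def build_constrained_request(question, constraints, excluded_entities, temperature=0.0):
    """Assemble a prompt that restates the constraints and rules out popular wrong answers.

    Hypothetical helper: only builds the request payload, it does not call any model.
    """
    lines = [question.strip(), "", "The answer must satisfy ALL of these constraints:"]
    lines += [f"- {c}" for c in constraints]
    if excluded_entities:
        # Negative constraints: name the popular-but-wrong entities explicitly.
        lines += ["", "Known wrong answers:"]
        lines += [f"- It is NOT {name}." for name in excluded_entities]
    # A low temperature reduces sampling variance; it does not remove the popularity bias.
    return {"prompt": "\n".join(lines), "temperature": temperature}


request = build_constrained_request(
    question="Which entity matches the description?",
    constraints=["constraint 1 (placeholder)", "constraint 2 (placeholder)"],
    excluded_entities=["Entity A"],
)
print(request["prompt"])
```

The exclusion list has to be known in advance, which is why this only helps once the popular wrong answer has already been observed.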
Journey Context:
LLMs reflect the distribution of their training data. If Entity A appears 100x more often than Entity B in that data, the model will heavily favor Entity A even when Entity B is the one that matches the specific constraints in the prompt. This is a fundamental weakness of autoregressive sampling over long-tail knowledge: the high-frequency association can outweigh the stated constraints. Prompting tricks help only marginally; routing to a structured DB is the only reliable fix for tail facts.
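The structured-DB route can be sketched as follows, assuming the constraints map onto queryable columns. The table `entities`, its columns, and the rows are illustrative placeholders (the 100x gap in `mention_count` mirrors the example above); nothing here describes a real dataset.

```python
import sqlite3


def find_matching_entities(conn, category, founded_year):
    """Return the names of entities that satisfy every constraint, ignoring popularity."""
    rows = conn.execute(
        "SELECT name FROM entities WHERE category = ? AND founded_year = ? ORDER BY name",
        (category, founded_year),
    ).fetchall()
    return [name for (name,) in rows]


# In-memory database with placeholder rows; Entity A is 100x more "mentioned".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entities (name TEXT, category TEXT, founded_year INTEGER, mention_count INTEGER)"
)
conn.executemany(
    "INSERT INTO entities VALUES (?, ?, ?, ?)",
    [
        ("Entity A", "example-category", 1990, 100000),
        ("Entity B", "example-category", 1985, 1000),
    ],
)

matches = find_matching_entities(conn, "example-category", 1985)
print(matches)  # ['Entity B']
```

Because the query filters on the constraints and never reads `mention_count`, the rarer entity is returned whenever it is the only row that matches, and an empty list signals that no stored entity satisfies the constraints.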
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T04:43:42.728804+00:00— report_created — created