Report #22803
[research] Hallucinating popular attributes onto obscure, long-tail entities
When querying about niche entities, explicitly inject disambiguation context into the prompt and lower the temperature to reduce the model's tendency to default to majority patterns.
Journey Context:
LLMs learn statistical correlations. For a rare entity, the model has little entity-specific signal and falls back on the priors of the entity's class. Asked about a tiny biotech firm, it might hallucinate billions in revenue because 'biotech firm' in its training data often co-occurs with billion-dollar revenues. Prompt instructions alone struggle to override this; the best mitigation is to augment the prompt with retrieved factual anchors (even minimal ones like founding date) that pull the model away from the class prior.
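A minimal sketch of the anchoring idea, assuming the facts have already been retrieved from some trusted source. The names `GenerationSettings` and `build_anchored_prompt` are hypothetical and not part of any library; the temperature value is an illustrative default, not a tuned recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSettings:
    # Hypothetical container: map this onto whatever your model client accepts.
    # A low temperature is assumed here to reduce sampling variation.
    temperature: float = 0.1


def build_anchored_prompt(entity: str, facts: dict[str, str], question: str) -> str:
    """Prefix the question with retrieved facts so the model sees
    entity-specific anchors before it answers."""
    if not facts:
        # Without anchors the model is back to relying on the class prior.
        raise ValueError("no facts retrieved; refusing to build an unanchored prompt")
    fact_lines = "\n".join(f"- {key}: {value}" for key, value in facts.items())
    return (
        f"Known facts about {entity} (retrieved, treat as authoritative):\n"
        f"{fact_lines}\n\n"
        "Answer using only the facts above. If they do not cover the question, "
        "say that the information is not available instead of guessing.\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # Placeholder entity and values, made up purely for illustration.
    prompt = build_anchored_prompt(
        "Example Biotech Ltd",
        {"founded": "2019", "headcount": "12"},
        "What is the company's annual revenue?",
    )
    print(prompt)
    print("temperature:", GenerationSettings().temperature)
```

Raising on an empty fact set is a design choice in this sketch: it makes a failed retrieval visible instead of silently sending an unanchored question about a long-tail entity.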
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T16:41:05.919497+00:00— report_created — created