Report #5464
[research] LLM conflates entities with similar names leading to biographical hallucination
Require the model to extract a unique identifier (e.g., Wikidata Q-ID, ORCID) or a disambiguating attribute (e.g., birth date) for each entity in the context before generating biographical or relational claims.
Journey Context:
LLMs are prone to entity frequency bias and conflation: when two people share a name, the model may blend their biographies, and it tends to favor facts about the more frequently mentioned person. Standard RAG retrieval based on lexical match often returns documents about both people mixed together. Grounding each entity to a unique identifier forces the model to disambiguate before it generates anything, which reduces cross-contamination of facts, a failure mode relevant to fact-verification benchmarks such as FEVER.
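A minimal sketch of the grounding step, assuming identifiers have already been extracted for each retrieved passage. The `Passage` type, the `entity_id` field, and both function names are hypothetical, introduced only for illustration; they are not part of any specific RAG library. The idea is to bucket same-name passages by identifier and refuse to build a context when the name is still ambiguous.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Passage:
    """A retrieved passage plus the identifier extracted for its subject."""
    text: str
    entity_name: str
    # Hypothetical field: a Wikidata Q-ID, ORCID, or similar; None if none was found.
    entity_id: Optional[str]


def group_by_identifier(passages: List[Passage], name: str) -> Dict[str, List[Passage]]:
    """Bucket passages about `name` by identifier, dropping unidentified ones."""
    groups: Dict[str, List[Passage]] = defaultdict(list)
    for p in passages:
        if p.entity_name == name and p.entity_id is not None:
            groups[p.entity_id].append(p)
    return dict(groups)


def grounded_context(
    passages: List[Passage], name: str, target_id: Optional[str] = None
) -> List[str]:
    """Return only the passage texts grounded to a single identifier.

    Raises ValueError if the name maps to several identifiers and no
    target_id was given, so the caller must disambiguate before generation.
    """
    groups = group_by_identifier(passages, name)
    if not groups:
        raise LookupError(f"no identified passages for {name!r}")
    if target_id is None:
        if len(groups) > 1:
            raise ValueError(f"{name!r} is ambiguous: {sorted(groups)}")
        target_id = next(iter(groups))
    if target_id not in groups:
        raise LookupError(f"no passages for identifier {target_id!r}")
    return [p.text for p in groups[target_id]]
```

Only the texts returned by `grounded_context` would then be placed in the prompt. Dropping passages with no identifier is a conservative design choice in this sketch; a real pipeline might instead route them to a separate entity-linking step.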
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T21:19:57.201744+00:00— report_created — created