Report #7911
[research] Conflating two distinct entities that share similar names or contexts (e.g., two different startups with the same name)
Require entity linking or disambiguation via Wikidata/DBpedia IDs before generating text about the entity. Prompt for unique identifiers rather than relying on string matching.
Journey Context:
String similarity is a poor proxy for entity identity in LLMs. Given only a name, a model will tend to blend the histories of two distinct entities into a single fabricated narrative. Grounding the entity to a unique database ID confines the model to one specific factual silo, which guards against cross-contamination.
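A minimal sketch of the disambiguation gate described here, in Python. Everything in it is an assumption made for illustration: the `Entity` type, the `CANDIDATES` table, the placeholder IDs (`Q-EXAMPLE-1` and so on, which are not real Wikidata IDs), and the prompt wording. In practice the candidates would come from an entity linker or a Wikidata/DBpedia lookup. The point is only that an ambiguous name raises an error instead of being passed to the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A candidate entity. All sample values below are hypothetical."""
    entity_id: str    # stands in for a Wikidata/DBpedia identifier
    label: str
    description: str


# Hypothetical candidate table; a real one would be filled by an entity linker.
CANDIDATES = [
    Entity("Q-EXAMPLE-1", "Acme", "fictional fintech startup, founded 2015"),
    Entity("Q-EXAMPLE-2", "Acme", "fictional robotics startup, founded 2021"),
    Entity("Q-EXAMPLE-3", "Globex", "fictional logistics company"),
]


class AmbiguousEntityError(ValueError):
    """Raised when a name matches several entities and no ID was supplied."""


def resolve(name, entity_id=None, candidates=CANDIDATES):
    """Return exactly one Entity, or raise rather than guess."""
    if entity_id is not None:
        # An explicit ID wins; the name is not used for matching at all.
        for candidate in candidates:
            if candidate.entity_id == entity_id:
                return candidate
        raise KeyError(f"unknown entity ID: {entity_id}")

    matches = [c for c in candidates if c.label.lower() == name.lower()]
    if not matches:
        raise KeyError(f"no entity named {name!r}")
    if len(matches) > 1:
        options = ", ".join(f"{c.entity_id} ({c.description})" for c in matches)
        raise AmbiguousEntityError(
            f"{name!r} is ambiguous; supply one of these IDs: {options}"
        )
    return matches[0]


def build_prompt(entity):
    """Build a prompt that pins generation to a single resolved entity."""
    return (
        f"Write about the entity with ID {entity.entity_id} "
        f"(label: {entity.label}; description: {entity.description}). "
        "Do not use facts about any other entity with the same or a similar name."
    )
```

Calling `resolve("Acme")` raises `AmbiguousEntityError` and lists the candidate IDs, while `resolve("Acme", entity_id="Q-EXAMPLE-2")` returns a single record that `build_prompt` can use. Failing loudly on ambiguity is a design choice in this sketch: it pushes the caller to ask for an identifier instead of letting the name alone decide.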
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T04:08:32.120098+00:00— report_created — created