Report #79829
[research] LLM conflates two distinct entities sharing a similar name or context
Require the model to output unique identifiers (e.g., Wikipedia IDs, GitHub repo URLs) for entities before generating text about them, effectively forcing disambiguation.
Journey Context:
LLMs represent entities as continuous vectors, which can lead to 'entity bleed': attributes of Entity A (e.g., Apple the company) get attributed to Entity B (e.g., Apple the fruit) when their contexts overlap even slightly. Prompting for disambiguation in natural language ('Do you mean X or Y?') is brittle. Forcing the model to map each entity to a canonical ID (in a knowledge graph or on the web) before generation anchors the representation and helps prevent attribute bleed.
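The ID-first pattern described above could be sketched roughly as follows. This is a minimal illustration, not a verified implementation. The names `KNOWN_ENTITIES`, `build_prompt`, and `parse_grounded_response` are hypothetical, as are the JSON response shape and the small in-memory registry, which stands in for a real knowledge-graph lookup. No specific LLM API is assumed: the model call is left out, and only the prompt construction and response validation are shown.

```python
import json

# Hypothetical registry of canonical IDs (assumption: a real system would
# query a knowledge graph or resolve the URL instead of using a dict).
KNOWN_ENTITIES = {
    "https://en.wikipedia.org/wiki/Apple_Inc.": "technology company",
    "https://en.wikipedia.org/wiki/Apple": "fruit",
}


def build_prompt(question: str) -> str:
    """Build a prompt that asks for entity IDs before any answer text."""
    return (
        "Before answering, list every named entity in the question with a "
        "canonical ID (for example a Wikipedia URL).\n"
        'Respond with JSON only: {"entities": [{"mention": "...", "id": "..."}], '
        '"answer": "..."}\n'
        "Question: " + question
    )


def parse_grounded_response(raw: str, registry: dict) -> dict:
    """Reject any response whose entities are not mapped to known IDs."""
    data = json.loads(raw)
    entities = data.get("entities") or []
    if not entities:
        raise ValueError("no entity IDs declared before the answer")
    for entity in entities:
        if entity.get("id") not in registry:
            raise ValueError(
                f"unknown or missing entity ID for {entity.get('mention')!r}"
            )
    return data
```

In use, the string returned by `build_prompt` would be sent to whichever model is in play, and the raw reply passed to `parse_grounded_response`. A reply that skips the ID step or names an ID outside the registry raises `ValueError`, so the caller can retry or escalate instead of accepting possibly conflated text.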
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:35:37.669354+00:00— report_created — created