Report #14887
[research] LLM resolves ambiguous entities to the most popular one in its training data regardless of user context
Implement an entity linking or disambiguation step (e.g., querying Wikidata or a knowledge graph) before generating the final response, forcing the model to acknowledge multiple possible referents.
Journey Context:
Training data frequency heavily biases generation. If a user asks about 'Apple' in a farming context, the model might still discuss the tech company. Prompting alone often fails, plausibly because the tech company's co-occurrence statistics dominate what the model attends to. Disambiguating against an external knowledge graph counters this bias by putting every candidate referent explicitly in front of the model before it answers.
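A minimal sketch of such a disambiguation step, under stated assumptions: the in-memory `CANDIDATES` table stands in for a real Wikidata or knowledge-graph lookup, the `ex:` identifiers and keyword sets are hypothetical, and keyword overlap is only a placeholder for a real entity-linking score. It ranks the possible referents against the user's context and produces a note that can be prepended to the prompt, so that all referents are listed and the model is told either which one to use or to ask for clarification.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    entity_id: str       # hypothetical identifier, not a real Wikidata QID
    label: str
    description: str
    keywords: frozenset  # hand-picked context words, for illustration only


# Hypothetical stand-in for a knowledge-graph query result.
CANDIDATES = {
    "apple": [
        Candidate("ex:apple-company", "Apple Inc.", "technology company",
                  frozenset({"iphone", "mac", "software", "stock", "device"})),
        Candidate("ex:apple-fruit", "apple", "fruit of the apple tree",
                  frozenset({"orchard", "harvest", "farming", "tree", "fruit", "pest"})),
    ]
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def rank_candidates(mention, context):
    """Return (overlap, candidate) pairs, highest keyword overlap first."""
    tokens = tokenize(context)
    candidates = CANDIDATES.get(mention.lower(), [])
    scored = [(len(c.keywords & tokens), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def build_disambiguation_note(mention, context):
    """Build a note listing every referent and, if clear, the one to use."""
    ranked = rank_candidates(mention, context)
    if not ranked:
        return f"No known referents for '{mention}'."
    lines = [f"'{mention}' may refer to:"]
    for score, c in ranked:
        lines.append(f"- {c.label} ({c.description}), context overlap {score}")
    top_score, top = ranked[0]
    tie = len(ranked) > 1 and ranked[1][0] == top_score
    if top_score == 0 or tie:
        # The context gives no basis for choosing, so do not guess.
        lines.append("Context does not settle the referent; ask the user to clarify.")
    else:
        lines.append(f"Use this referent: {top.label} ({top.description}).")
    return "\n".join(lines)


if __name__ == "__main__":
    question = "What pest treatments should I use in my orchard before the apple harvest?"
    print(build_disambiguation_note("Apple", question))
```

The note is meant to be placed ahead of the user's question in the final prompt. When the context gives no signal, or two candidates tie, the sketch deliberately asks for clarification instead of falling back to the most popular entity.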
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T22:42:22.901566+00:00— report_created — created