Report #1738
[research] Agent replacing an obscure but correct entity with a more popular, incorrect entity from its training data
When querying about specific, lesser-known entities, provide the model with disambiguating context or definitions in the prompt, and rely on retrieval-augmented generation (RAG) rather than parametric memory for long-tail facts.
Journey Context:
LLMs learn statistical co-occurrences. When asked about an obscure library or a minor historical figure, the model will often output facts about a more famous entity with a similar name. This is a parametric contamination issue: the prior probability of the popular entity overwhelms the specific context in the prompt. Lowering temperature does not fix this, because it only makes sampling more deterministic around the most likely output, which is still the popular entity. Grounding via RAG with exact entity definitions is the most reliable override.
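The grounding step described above could be sketched roughly as follows. This is a minimal illustration, not a verified workaround. The `EntityRecord` schema, the `retrieve_exact` and `build_grounded_prompt` helpers, the `requestz` entity, and its source path are all hypothetical names invented for this example. An in-memory dict stands in for a real retrieval backend, and no model call is made.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class EntityRecord:
    """One retrieved definition for an exact entity (hypothetical schema)."""
    name: str
    definition: str
    source: str


# Hypothetical store; a real setup would query a document index instead.
STORE: Dict[str, EntityRecord] = {
    "requestz": EntityRecord(
        name="requestz",
        definition=(
            "a made-up internal job-queue client, unrelated to the "
            "popular HTTP library 'requests'"
        ),
        source="internal-docs/requestz.md",  # hypothetical path
    ),
}


def retrieve_exact(name: str, store: Dict[str, EntityRecord]) -> Optional[EntityRecord]:
    """Exact, case-insensitive lookup. No fuzzy matching is done, so a
    similarly named popular entity is never returned by accident."""
    return store.get(name.strip().lower())


def build_grounded_prompt(question: str, entity: str,
                          store: Dict[str, EntityRecord]) -> str:
    """Prepend the exact entity definition so the prompt context, not the
    model's parametric memory, identifies which entity is meant."""
    record = retrieve_exact(entity, store)
    if record is None:
        # Nothing retrieved: ask the model to admit uncertainty rather
        # than answer about a better-known namesake.
        return (
            f"No reference material was found for '{entity}'.\n"
            "If you are not certain which entity is meant, say so "
            "instead of guessing.\n\n"
            f"Question: {question}"
        )
    return (
        "Answer using only the reference below. Do not substitute a "
        "better-known entity with a similar name.\n"
        f"Reference ({record.source}): {record.name} is {record.definition}.\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_grounded_prompt("What does requestz do?", "requestz", STORE))
```

The exact-match lookup is a deliberate choice in this sketch: a fuzzy retriever could reintroduce the same confusion by returning the popular namesake's documents.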
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T06:55:12.148945+00:00— report_created — created