Report #91738
[research] LLM substitutes a rare but correct entity with a more common, similar-looking entity
Apply constrained decoding, or add few-shot examples that use the rare entity verbatim. If using RAG, boost the retrieval score of documents that contain an exact string match for the rare entity.
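A minimal sketch of the RAG suggestion, assuming the retriever returns (text, score) pairs and that a flat additive bonus is an acceptable way to boost exact matches. The function name, the boost value, and the sample documents (including the `tinyutils.debounceLeading` entity) are hypothetical and do not come from any real library.

```python
import re
from typing import List, Tuple

def rerank_with_exact_match(
    entity: str,
    docs: List[Tuple[str, float]],
    boost: float = 0.5,
) -> List[Tuple[str, float]]:
    """Add a fixed bonus to documents that contain the entity verbatim.

    The lookarounds prevent a match inside a longer identifier, so
    "debounce" does not match "debounceLeading".
    """
    pattern = re.compile(r"(?<!\w)" + re.escape(entity) + r"(?!\w)")
    rescored = [
        (text, score + (boost if pattern.search(text) else 0.0))
        for text, score in docs
    ]
    # Highest score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

# Hypothetical retrieval results: the popular entity outranks the rare one.
DOCS = [
    ("lodash.debounce delays invoking a function", 0.9),
    ("tinyutils.debounceLeading fires on the leading edge", 0.6),
]

ranked = rerank_with_exact_match("tinyutils.debounceLeading", DOCS)
print(ranked[0][0])
```

The match is case-sensitive on purpose, since the goal is to keep the exact spelling of the rare entity in the retrieved context. The boost size is a tuning assumption and would need checking against real retrieval scores.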
Journey Context:
LLMs learn statistical co-occurrences. If a user asks about a niche library or obscure API, the model often 'corrects' it to a popular one (e.g., swapping a lesser-known utils function for the standard lodash one). This is a frequency bias, not a deliberate error. Standard prompting rarely fixes it because the model's priors are too strong. Constrained decoding (forcing the output to include the exact rare token sequence) or explicit context injection is required to override the base distribution.
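The effect of constrained decoding can be illustrated with a toy greedy decoder. This is a sketch under stated assumptions, not a real model: the per-step scores are invented to mimic a prior that favors the popular entity, and `tinyutils.debounceLeading` is a hypothetical rare entity. The constraint works by masking every token except the required one at each step.

```python
from typing import Dict, List, Optional, Set

def pick_token(logits: Dict[str, float], allowed: Optional[Set[str]] = None) -> str:
    """Greedy choice, optionally restricted to an allowed token set."""
    candidates = {
        tok: score
        for tok, score in logits.items()
        if allowed is None or tok in allowed
    }
    if not candidates:
        raise ValueError("constraint excludes every token in the vocabulary")
    return max(candidates, key=candidates.get)

def decode(
    step_logits: List[Dict[str, float]],
    required: Optional[List[str]] = None,
) -> List[str]:
    """Decode one token per step; if `required` is given, force that sequence."""
    output = []
    for i, logits in enumerate(step_logits):
        forced = required is not None and i < len(required)
        allowed = {required[i]} if forced else None
        output.append(pick_token(logits, allowed))
    return output

# Invented scores: the common entity wins at every ambiguous step.
STEP_LOGITS = [
    {"lodash": 2.0, "tinyutils": -1.0},
    {".": 3.0},
    {"debounce": 2.5, "debounceLeading": -0.5},
]
RARE = ["tinyutils", ".", "debounceLeading"]

print("".join(decode(STEP_LOGITS)))        # unconstrained: follows the prior
print("".join(decode(STEP_LOGITS, RARE)))  # constrained: exact rare sequence
```

In a real inference stack the same masking would be applied to the model's logits at generation time. Hugging Face Transformers, for example, exposes hooks of this kind on `generate` (such as `prefix_allowed_tokens_fn`), though that route is not something this report verified.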
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:34:32.270800+00:00— report_created — created