Report #75193
[counterintuitive] Model knows A is B but fails when asked what is B — knowledge appears missing in reverse direction
State each fact in the direction the model will be queried. If the model must answer 'Who is X?', provide 'X is Y' in context, not 'Y is X'. For knowledge injection via system prompts or fine-tuning, anticipate the likely queries and include both directions explicitly.
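A minimal sketch of the "include both directions" advice, assuming facts are injected as plain sentences into a system prompt. The helper name `both_directions` and the `{a}`/`{b}` template placeholders are hypothetical, chosen only for illustration; the reverse phrasing has to be written by hand for each relation type.

```python
def both_directions(subject, obj, forward_template, reverse_template):
    """Render one fact in both query directions.

    Hypothetical helper: the templates use {a} for the subject and {b}
    for the object, and the caller supplies the reverse wording.
    """
    return [
        forward_template.format(a=subject, b=obj),
        reverse_template.format(a=subject, b=obj),
    ]


facts = both_directions(
    "Tom Cruise",
    "Mary Lee Pfeiffer",
    forward_template="{a}'s mother is {b}.",
    reverse_template="{b}'s son is {a}.",
)

# Join the statements into a block that could be placed in a system prompt.
context_block = "\n".join(facts)
```

Here `facts` holds one sentence per direction, so a query starting from either name finds a statement that begins with that name. Whether this is sufficient for a given model should be checked empirically.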
Journey Context:
This is the Reversal Curse. Models trained on 'Tom Cruise's mother is Mary Lee Pfeiffer' can answer 'Who is Tom Cruise's mother?' but often fail at 'Who is Mary Lee Pfeiffer's son?' Autoregressive models learn conditional probabilities P(token_n | token_1 ... token_{n-1}), so 'A is B' trains P(B|A) but not P(A|B): the sentence raises the probability of B after the prefix 'A is' and gives no direct signal for producing A after the prefix 'B is'. The model does not automatically invert logical relationships. This follows from the next-token prediction objective rather than from a training data bug. Information injected in one direction may therefore be inaccessible from the other direction, and the effect has been reported across model sizes, so scaling or prompt wording alone should not be expected to fix it. It can silently break RAG pipelines where documents state facts in one direction but queries come from the other.
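The asymmetry can be illustrated with a toy count-based next-token model. This is a deliberately crude sketch, not how an LLM works: it is a lookup table over exact prefixes with no generalization, and the function names are hypothetical. It only shows that a left-to-right training objective records "what follows 'A is'" and nothing about "what follows 'B is'".

```python
from collections import Counter, defaultdict


def train_next_token_counts(sentences):
    """Count which token follows each prefix seen in the training sentences."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i in range(1, len(tokens)):
            prefix = tuple(tokens[:i])
            counts[prefix][tokens[i]] += 1
    return counts


def next_token_prob(counts, prefix, token):
    """Estimate P(token | prefix); 0.0 if the prefix never appeared."""
    seen = counts.get(tuple(prefix.split()))
    if not seen:
        return 0.0
    return seen[token] / sum(seen.values())


# Train on the forward statement only.
counts = train_next_token_counts(["A is B"])
forward_prob = next_token_prob(counts, "A is", "B")   # direction seen in training
reverse_prob = next_token_prob(counts, "B is", "A")   # direction never seen

# Adding the reverse statement explicitly gives the reverse query a signal.
counts_both = train_next_token_counts(["A is B", "B is A"])
reverse_after = next_token_prob(counts_both, "B is", "A")
```

In this toy, `forward_prob` is 1.0 and `reverse_prob` is 0.0 until the reverse sentence is added to the training data, which mirrors the recommendation to state both directions explicitly.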
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T08:48:22.599160+00:00— report_created — created