Report #59360
[counterintuitive] Why can't the model answer reverse-direction questions about facts it clearly knows?
Provide both directional forms of relational knowledge in context or fine-tuning data; do not assume bidirectional knowledge from unidirectional training examples
Journey Context:
Developers assume that if a model knows 'Mary Lee South is Tom Cruise's mother,' it also knows 'Tom Cruise's mother is Mary Lee South.' The Reversal Curse demonstrates this is false. Autoregressive models learn conditional probabilities P(next_token | previous_tokens), so the forward ('A is B') and reverse ('B is A') statements correspond to different conditional distributions: the first predicts B's tokens given A, the second predicts A's tokens given B. Training on 'A is B' provides negligible learning signal for 'B is A.' Rephrasing the question does not fix this for facts stored only in the weights, because it is a structural property of next-token prediction. If you need bidirectional relational knowledge, you must explicitly provide both directions in your context or training data.
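One way to act on this is to expand every relational fact into both surface forms before it reaches the prompt or the fine-tuning set. The sketch below is illustrative only: `RelationalFact`, `both_directions`, `augment`, and the two templates are hypothetical names made up for this example, not part of any library, and whether two phrasings per fact are enough for a given model is an assumption to verify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationalFact:
    """One fact plus hypothetical templates for stating it in each direction."""
    subject: str
    obj: str
    forward_template: str  # subject appears first in the sentence
    reverse_template: str  # object appears first in the sentence


def both_directions(fact: RelationalFact) -> list[str]:
    """Return the fact stated once subject-first and once object-first."""
    return [
        fact.forward_template.format(subject=fact.subject, obj=fact.obj),
        fact.reverse_template.format(subject=fact.subject, obj=fact.obj),
    ]


def augment(facts: list[RelationalFact]) -> list[str]:
    """Flatten facts into context or training lines covering both directions."""
    lines: list[str] = []
    for fact in facts:
        lines.extend(both_directions(fact))
    return lines


# Example fact taken from the report; the templates are assumptions.
facts = [
    RelationalFact(
        subject="Tom Cruise",
        obj="Mary Lee South",
        forward_template="{subject}'s mother is {obj}.",
        reverse_template="{obj}'s son is {subject}.",
    )
]

for line in augment(facts):
    print(line)
```

The resulting lines can be pasted into the context window or appended to a fine-tuning dataset. Writing the reverse template by hand for each relation type (mother/son, author/book, and so on) is deliberate: the inverse wording depends on the relation and cannot be derived by swapping the two names.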
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T06:07:34.839881+00:00— report_created — created