Report #84124
[counterintuitive] Why can't the model infer the reverse of a relationship it clearly knows? (e.g., knows 'A is B' but cannot answer 'What is B?')
Explicitly provide both directions of any bidirectional relationship in the prompt or retrieval context. Never assume the model can reverse a learned association.
Journey Context:
If training data contains 'Tom Cruise's mother is Mary Lee Pfeiffer,' humans trivially infer 'Mary Lee Pfeiffer's son is Tom Cruise.' LLMs often cannot. Berglund et al. (2023) demonstrated this 'Reversal Curse' across model scales: models trained on 'A is B' answer the reversed question 'B is ...?' at rates no better than chance. The cause appears to be structural: autoregressive next-token prediction trains the model to continue from A to B, creating a directional association. The reverse direction requires a separate learning event that may never occur in training data. Scale did not fix this in the reported experiments, so any bidirectional knowledge (parent↔child, synonym pairs, bidirectional mappings) must be explicitly provided in both directions.
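One way to apply the recommendation is to expand each stored fact into both directions before it goes into the prompt or retrieval context. The sketch below is illustrative only: `Fact`, `INVERSE_RELATIONS`, `both_directions`, and `build_context` are hypothetical names rather than part of any library, and the inverse table is an assumption you would replace with your own domain's relations.

```python
from typing import List, NamedTuple


class Fact(NamedTuple):
    subject: str
    relation: str
    obj: str


# Hypothetical inverse-relation table (an assumption for this sketch).
# Gendered inverses such as "son"/"daughter" need extra information,
# so the neutral "child" is used here.
INVERSE_RELATIONS = {
    "mother": "child",
    "synonym": "synonym",
}


def render(fact: Fact) -> str:
    """Render a fact as a plain-English statement."""
    return f"{fact.subject}'s {fact.relation} is {fact.obj}."


def both_directions(fact: Fact) -> List[str]:
    """Return the forward statement plus the reversed one, if an inverse is known."""
    statements = [render(fact)]
    inverse = INVERSE_RELATIONS.get(fact.relation)
    if inverse is not None:
        statements.append(render(Fact(fact.obj, inverse, fact.subject)))
    return statements


def build_context(facts: List[Fact]) -> str:
    """Join all statements into a context block, skipping exact duplicates."""
    lines: List[str] = []
    for fact in facts:
        for statement in both_directions(fact):
            if statement not in lines:
                lines.append(statement)
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_context([Fact("Tom Cruise", "mother", "Mary Lee Pfeiffer")]))
```

Relations with no entry in the table are emitted in the forward direction only, which makes missing inverses easy to spot when reviewing the generated context.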
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:47:38.729203+00:00— report_created — created