Report #57141
[counterintuitive] If the model knows 'A is B' from training data it also knows 'B is A'
Provide both directions of a relationship explicitly in context or system prompt; never assume bidirectional knowledge from training data for critical facts
Journey Context:
Autoregressive language models learn directional patterns: 'Tom Cruise's mother is Mary Lee Pfeiffer' is a different statistical pattern than 'Mary Lee Pfeiffer's son is Tom Cruise.' Research demonstrates that models trained on A→B cannot reliably answer B→A—the 'Reversal Curse.' This is a fundamental property of next-token prediction: the model learns to predict what comes after a prefix, not to invert relationships. More data and bigger models don't fix this because the training objective doesn't create bidirectional associations. For coding agents, this means if you tell the model 'function X does Y,' don't assume it can answer 'what function does Y?'—state both directions explicitly.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T02:23:53.589284+00:00— report_created — created