Report #65947
[counterintuitive] LLM fails to navigate a maze or track object positions on a 2D grid even when the grid is described textually
Convert spatial problems into graph representations or coordinate-based logic handled by code, rather than asking the LLM to 'visualize' or track spatial relations.
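A minimal sketch of the coordinate-based approach, assuming the LLM is only asked to extract a start position and a list of moves from the text, and code does the tracking. The names `MOVES` and `apply_moves`, and the convention that y grows downward, are illustrative assumptions, not part of the original report.

```python
# Hypothetical helper: track an object's position with explicit coordinates
# instead of asking the model to keep the grid "in mind".
# Assumed convention: x grows to the right, y grows downward.
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def apply_moves(start, moves, width, height):
    """Return the final (x, y) after applying moves in order.

    Moves that would leave the width x height grid are ignored.
    """
    x, y = start
    for move in moves:
        dx, dy = MOVES[move]
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            x, y = nx, ny
    return (x, y)


if __name__ == "__main__":
    print(apply_moves((0, 0), ["right", "right", "down", "left"], 3, 3))
```

The state lives in two integers, so accuracy does not degrade as the number of moves or the grid size grows.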
Journey Context:
Humans readily map text onto a mental 2D or 3D space. LLMs process 1D token sequences and have no built-in 2D inductive bias or spatial architecture. When tracking spatial relations, they rely on shallow statistical correlations between words (e.g., 'left' and 'right'), which break down rapidly as the state space grows. Prompting alone cannot give a 1D sequence model a 2D mental canvas.
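For the maze case, one way to sidestep the problem is to parse the textual grid into a graph and run a standard breadth-first search in code. The sketch below assumes a hypothetical text encoding ('#' for walls, 'S' for start, 'E' for exit, anything else open) and a hypothetical function name `solve_maze`; neither comes from the original report.

```python
from collections import deque


def solve_maze(rows):
    """Return a shortest path from 'S' to 'E' as a list of (row, col), or None.

    rows: list of strings describing the grid. '#' is a wall, 'S' the start,
    'E' the exit; every other character is treated as open floor.
    """
    cells = {(r, c): ch for r, line in enumerate(rows) for c, ch in enumerate(line)}
    start = next(p for p, ch in cells.items() if ch == "S")
    goal = next(p for p, ch in cells.items() if ch == "E")

    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == goal:
            # Walk parent links back to the start, then reverse.
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        r, c = cur
        for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            # Cells outside the grid are treated as walls.
            if cells.get(nxt, "#") != "#" and nxt not in parent:
                parent[nxt] = cur
                queue.append(nxt)
    return None


if __name__ == "__main__":
    maze = ["S.#",
            ".##",
            "..E"]
    print(solve_maze(maze))
```

In this setup the LLM's job, if it has one at all, is limited to producing the grid encoding or interpreting the returned path; the search itself is deterministic.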
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:10:22.577556+00:00— report_created — created