Report #36926
[counterintuitive] Model fails to solve Sudoku, navigate a 2D maze, or understand ASCII art layouts
Convert 2D spatial problems into 1D logical constraints or coordinate systems, and use code execution to evaluate spatial states. Do not ask the LLM to 'visualize' a grid.
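As a minimal sketch of this advice (not a verified workaround), the maze case could be handled by parsing the ASCII layout into explicit (row, col) coordinates and letting ordinary code search it. The names `parse_grid`, `shortest_path`, and `MAZE`, along with the `#`/`S`/`G` symbols, are illustrative assumptions and do not come from the report.

```python
from collections import deque


def parse_grid(text):
    """Turn an ASCII maze into explicit coordinates: open cells, start, goal."""
    open_cells, start, goal = set(), None, None
    for r, row in enumerate(text.splitlines()):
        for c, ch in enumerate(row):
            if ch == "#":  # assumed wall symbol
                continue
            open_cells.add((r, c))
            if ch == "S":  # assumed start symbol
                start = (r, c)
            elif ch == "G":  # assumed goal symbol
                goal = (r, c)
    return open_cells, start, goal


def shortest_path(open_cells, start, goal):
    """Breadth-first search over 4-connected cells.

    Returns a list of (row, col) from start to goal, or None if unreachable.
    """
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if nxt in open_cells and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None


# Hypothetical example maze, used only for illustration.
MAZE = "\n".join([
    "#####",
    "#S..#",
    "#.#.#",
    "#..G#",
    "#####",
])

if __name__ == "__main__":
    cells, start, goal = parse_grid(MAZE)
    print(shortest_path(cells, start, goal))
```

With this split, the model only has to produce or call the code; adjacency and reachability are decided by the program, not by the model's reading of the grid.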
Journey Context:
Humans easily visualize grids and spatial layouts, and developers often assume LLMs can too, since the models have ingested ASCII art and chess games. However, text is flattened into a 1D sequence of tokens before it enters the transformer. The model lacks 2D convolutional inductive biases; it must infer spatial adjacency purely through positional encodings and attention over linearized text, which is inefficient and brittle when strict spatial constraints must hold.
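The flattening can be made concrete with a small sketch. It works at the character level only, which is an assumption made for illustration: a real tokenizer may merge or split characters, so actual token distances will differ. The names `GRID`, `WIDTH`, and `flat_index` are hypothetical.

```python
# A 5x5 ASCII grid as it would appear in a prompt: rows joined by newlines.
GRID = "\n".join([
    "#####",
    "#S..#",
    "#.#.#",
    "#..G#",
    "#####",
])
WIDTH = 5  # characters per row, excluding the newline


def flat_index(row, col, width=WIDTH):
    """Position of cell (row, col) in the flattened string.

    Each row occupies width + 1 characters because of the trailing newline.
    """
    return row * (width + 1) + col


if __name__ == "__main__":
    here = flat_index(1, 1)
    right = flat_index(1, 2)
    below = flat_index(2, 1)
    # Horizontal neighbors sit next to each other in the sequence,
    # while vertical neighbors are a full row (plus newline) apart.
    print("right neighbor offset:", right - here)
    print("lower neighbor offset:", below - here)
```

The point of the sketch is that "directly below" is not a local relationship in the linear sequence; it depends on the row width, which the model has to work out from context each time.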
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:27:29.818853+00:00— report_created — created