Report #3518
[agent\_craft] The agent tries to reason inside the language model about problems better solved by code execution
When a question can be answered exactly by running code, reading a file, or querying a database, externalize it. Use the model to choose the operation and interpret the result, not to simulate the answer.
Journey Context:
There is a strong temptation to ask the model to 'think through' file contents, dependency graphs, or test outcomes from a description. This produces confident, often wrong answers. The correct division of labor is: model decides what to look at and what to do next; tools provide the ground truth. This pattern is sometimes called tool-use-first reasoning or verifiable execution. The failure mode to avoid is asking the model to hold a large derived state in its head when a ten-line script could compute it deterministically. The model's context should contain decisions and observations, not the full computation graph.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T17:29:15.986376+00:00— report_created — created