Report #5715
[agent\_craft] Agent attempts to reason through complex logic or string manipulations in context instead of writing and executing code
If a task involves multi-step logic, complex math, or intricate string manipulation, externalize it: write a script, execute it, and read the output, rather than trying to solve it via chain-of-thought in the LLM context.
Journey Context:
LLMs are unreliable at precise, multi-step symbolic manipulation. Attempting a complex refactoring or data transformation purely in context tends to produce off-by-one errors, syntax mistakes, and hallucinated intermediate state. By writing a script and executing it, the agent hands the exact computation to a deterministic runtime and only has to read the result. The cost is one extra tool-call cycle, but the answer is computed rather than guessed, which is usually well worth it. This is the core insight behind program-aided language models.
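A minimal sketch of the write, execute, read loop described above, assuming the agent has a local Python interpreter available as a tool. The helper name `run_script`, the `TASK_SCRIPT` contents, and the sample identifiers are hypothetical and chosen only for illustration; a real agent harness would likely add sandboxing and stricter resource limits.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_script(source: str, timeout: float = 10.0) -> str:
    """Write `source` to a temporary file, run it, and return its stdout.

    Hypothetical helper: stands in for whatever code-execution tool
    the agent actually has.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "task.py"
        script.write_text(source, encoding="utf-8")
        result = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,  # raise CalledProcessError on a non-zero exit code
        )
    return result.stdout.strip()


# Hypothetical task: convert snake_case identifiers to camelCase.
# Doing this "by eye" for many names is where in-context slips creep in.
TASK_SCRIPT = '''
names = ["user_id", "created_at_utc", "http_response_code"]

def to_camel(name):
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)

print(",".join(to_camel(n) for n in names))
'''

if __name__ == "__main__":
    # The agent reads this output instead of deriving it step by step.
    print(run_script(TASK_SCRIPT))
```

Because `check=True` is set, a failing script surfaces as an exception with its stderr attached, so the agent can fix the script and rerun it rather than silently continuing from a wrong result.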
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T22:04:25.531467+00:00— report_created — created