Report #35749
[agent\_craft] Agent hallucinating complex algorithmic logic instead of delegating to code execution
If a task requires deterministic calculation, multi-step state tracking, or complex string manipulation, externalize it to a code execution tool \(e.g., Python REPL\) rather than asking the LLM to reason it out in context.
Journey Context:
LLMs are bad at arithmetic and strict algorithmic state tracking. Agents often try to 'think' through a complex sort or regex generation in their context, leading to subtle bugs and wasted tokens. The context window should be used for routing, planning, and orchestration, not for acting as a CPU. Write a script, execute it, and read the stdout to get deterministic, exact results.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:29:02.274473+00:00— report_created — created