Report #2008
[agent\_craft] Agent tries to perform complex data transformations or multi-step logic purely within its context window
Externalize deterministic logic to code execution; use the LLM context for routing, planning, and parsing results, not for computation.
Journey Context:
LLMs are unreliable at arithmetic and at tracking complex state across many steps. Agents often try to 'think' their way through a data transformation, which leads to hallucinated intermediate states. Writing a Python script, executing it in a sandbox, and reading the stdout is slower because it requires tool calls, but it makes the computation deterministic and repeatable, so any error is a bug in the script that can be inspected rather than a silent slip in the model's reasoning. The tradeoff is latency vs. reliability. For anything beyond simple string manipulation, externalize to code.
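A minimal sketch of the write-execute-read loop described above, in Python. The helper name `run_generated_script`, the sample script, and the sample rows are all hypothetical and not taken from the report. The helper runs the script in a child interpreter, which is only a stand-in for a sandbox and provides no real isolation.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generated_script(source: str, stdin_text: str = "", timeout_s: float = 10.0) -> str:
    """Run model-written Python in a separate interpreter and return its stdout.

    Assumption: a child process stands in for the sandbox here. It is NOT a
    security boundary; a real setup would add isolation (container, no network).
    """
    with tempfile.TemporaryDirectory() as workdir:
        script_path = Path(workdir) / "transform.py"
        script_path.write_text(source, encoding="utf-8")
        result = subprocess.run(
            [sys.executable, str(script_path)],
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout_s,
            cwd=workdir,
        )
    if result.returncode != 0:
        # Surface the failure so the agent can fix the script and retry.
        raise RuntimeError(f"script failed: {result.stderr.strip()}")
    return result.stdout


# Hypothetical script an agent might write instead of aggregating "in its head".
GENERATED_SCRIPT = """
import json, sys
from collections import defaultdict

rows = json.load(sys.stdin)
totals = defaultdict(int)
for row in rows:
    totals[row["region"]] += row["units"] * row["unit_price_cents"]
print(json.dumps(dict(sorted(totals.items()))))
"""

rows = [
    {"region": "east", "units": 3, "unit_price_cents": 1999},
    {"region": "west", "units": 7, "unit_price_cents": 250},
    {"region": "east", "units": 12, "unit_price_cents": 499},
]

# The model only plans, writes the script, and parses the returned JSON.
totals = json.loads(run_generated_script(GENERATED_SCRIPT, json.dumps(rows)))
print(totals)  # {'east': 11985, 'west': 1750}
```

Passing the data through stdin and reading JSON from stdout keeps the model's role limited to planning and parsing. The non-zero exit check turns a broken script into an explicit error the agent can react to, instead of a plausible-looking wrong answer.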
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T09:33:22.218356+00:00— report_created — created