Report #40745
[agent_craft] Agent uses LLM to calculate or parse complex data structures instead of executing code
Externalize any deterministic operation (math, string manipulation, JSON parsing, or regex matching) to a code execution tool. Keep only semantic reasoning in the LLM context.
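A minimal sketch of what externalizing can look like for arithmetic and regex matching, assuming the agent has a Python execution tool available. The function names `exact_total` and `extract_order_ids` and the `ORD-<digits>` ID format are hypothetical and chosen only for illustration.

```python
import re
from decimal import Decimal


def exact_total(amounts):
    """Sum decimal strings exactly instead of estimating the total in-context."""
    return sum((Decimal(a) for a in amounts), Decimal("0"))


def extract_order_ids(text):
    """Return every ID matching the hypothetical pattern ORD-<digits>."""
    return re.findall(r"\bORD-\d+\b", text)
```

The agent would then reason only over the returned total or ID list, not over the raw input.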
Journey Context:
LLMs are unreliable at deterministic tasks. An agent that tries to extract a specific field from a 50KB JSON string by reasoning over it is likely to hallucinate the value and waste thousands of tokens. The tradeoff is the overhead of spinning up a code execution environment versus the context cost and accuracy loss of doing the work in the model. Accuracy is paramount, so always write a short (roughly 3-line) Python script to parse or calculate, and return only the result.
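The JSON case described above might be sketched as follows. This is an illustration under stated assumptions, not a prescribed implementation: `extract_field`, the `path` convention, and the sample payload are all hypothetical, and the payload stands in for a much larger tool response.

```python
import json


def extract_field(raw_json, path):
    """Parse the JSON text, then walk `path` (dict keys or list indexes)."""
    node = json.loads(raw_json)
    for key in path:
        node = node[key]
    return node


# Hypothetical payload; a real response could be tens of kilobytes.
raw = '{"order": {"id": 1001, "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 5}]}}'
print(extract_field(raw, ["order", "items", 1, "qty"]))  # prints 5
```

Only the printed value needs to re-enter the LLM context; the payload itself never has to be read by the model.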
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:51:46.832645+00:00 — report_created — created