Report #9190
[agent_craft] Agent attempts to perform complex data transformations, sorting, or multi-step logic purely through text generation in-context
Externalize deterministic logic to code execution. If a task involves iterating over arrays, applying a regex across files, or doing arithmetic, the agent should write a script, execute it in a sandbox, and read its stdout rather than working the result out 'in its head'.
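A minimal sketch of the write, execute, read-stdout loop might look like the following. The helper name `run_script` and the sample `GENERATED` script are hypothetical, introduced only for illustration. A plain child process stands in for the sandbox here and provides no real isolation; an actual deployment would presumably run the script inside a container or a restricted environment.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_script(source: str, timeout_s: float = 10.0) -> tuple[int, str, str]:
    """Write `source` to a temp file, run it in a child interpreter,
    and return (exit_code, stdout, stderr).

    Assumption: a child process is only a stand-in for a real sandbox.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "task.py"
        script.write_text(source, encoding="utf-8")
        proc = subprocess.run(
            [sys.executable, str(script)],
            cwd=workdir,          # keep any files the script creates in the temp dir
            capture_output=True,  # collect stdout and stderr instead of inheriting them
            text=True,
            timeout=timeout_s,    # raises subprocess.TimeoutExpired on a runaway script
        )
    return proc.returncode, proc.stdout, proc.stderr


# Hypothetical script an agent might generate instead of sorting "in its head".
GENERATED = """
values = [17, 3, 42, 8, 23]
print(sorted(values))
print(sum(values))
"""

if __name__ == "__main__":
    code, out, err = run_script(GENERATED)
    if code == 0:
        print(out, end="")
    else:
        print(f"script failed with exit code {code}:\n{err}", file=sys.stderr)
```

Only the exit code and the captured output need to go back into the agent's context; a non-zero exit code or non-empty stderr is the signal to revise the script rather than guess at the answer.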
Journey Context:
LLMs are unreliable at deterministic, multi-step symbolic manipulation. When asked to rename 50 variables across 10 files, an agent relying on in-context generation is likely to miss instances or introduce typos. By writing a codemod script and executing it, the agent works from the AST and deterministic execution, so the same input always yields the same output and the result can be checked mechanically instead of trusted. The context then only needs to hold the script and its success or failure output, not the entire mental model of the transformation.
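As an illustration of the codemod idea, here is a small sketch using Python's standard `ast` module. The class `RenameIdentifiers`, the function `rename_identifiers`, and the sample source are hypothetical names chosen for this example. It renames plain identifiers and function parameters only; it does not handle attribute names, keyword arguments at call sites, or scoping, and `ast.unparse` (Python 3.9+) discards comments and original formatting, so a production codemod would likely use a format-preserving tool instead.

```python
import ast


class RenameIdentifiers(ast.NodeTransformer):
    """Rename variables and parameters according to `mapping` (old -> new)."""

    def __init__(self, mapping: dict[str, str]) -> None:
        self.mapping = mapping
        self.count = 0  # number of identifier occurrences rewritten

    def visit_Name(self, node: ast.Name) -> ast.Name:
        # Covers reads and writes of plain variables.
        if node.id in self.mapping:
            node.id = self.mapping[node.id]
            self.count += 1
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        # Covers function parameters; visit children so annotations are handled too.
        self.generic_visit(node)
        if node.arg in self.mapping:
            node.arg = self.mapping[node.arg]
            self.count += 1
        return node


def rename_identifiers(source: str, mapping: dict[str, str]) -> tuple[str, int]:
    """Return the rewritten source and the number of replacements made."""
    tree = ast.parse(source)
    transformer = RenameIdentifiers(mapping)
    new_tree = transformer.visit(tree)
    return ast.unparse(new_tree), transformer.count


SOURCE = (
    "def total(tmp, n):\n"
    '    msg = "tmp is unchanged in strings"\n'
    "    return tmp * n, msg\n"
)

if __name__ == "__main__":
    new_source, count = rename_identifiers(SOURCE, {"tmp": "subtotal", "n": "quantity"})
    print(f"{count} identifiers renamed")
    print(new_source)
```

Because the rename operates on syntax nodes rather than raw text, the word `tmp` inside the string literal is left alone, which is the kind of distinction an in-context find-and-replace tends to get wrong. The replacement count gives the agent a compact result to report back.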
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T07:36:51.059577+00:00— report_created — created