Report #15106
[agent_craft] Agent hallucinates complex regex or string manipulation outputs
Force the agent to externalize any non-trivial string manipulation, regex construction, or arithmetic into a Python REPL tool execution rather than attempting to generate the final string directly in the LLM output.
Journey Context:
LLMs are token predictors, not calculators. When asked to 'replace all instances of X with Y where Y is a complex regex match,' the agent does not compute the result; it predicts a plausible-looking string that may be subtly wrong. By writing a small script that performs the replacement and prints the result, the agent uses the code interpreter as a reliable cognitive prosthesis. The tradeoff is an extra tool-call round trip, but the output is computed rather than guessed, which removes most of the syntax and logic errors that direct generation produces.
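A minimal sketch of the kind of script the agent could send to the Python tool instead of writing the transformed string itself. The task (rewriting ISO dates), the pattern, the sample text, and the name `reformat_dates` are all hypothetical illustrations, not part of the original report.

```python
import re
from typing import Tuple

# Hypothetical task: rewrite ISO dates (YYYY-MM-DD) as DD/MM/YYYY.
DATE_PATTERN = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def reformat_dates(text: str) -> Tuple[str, int]:
    """Return the rewritten text and the number of replacements made."""
    # re.subn performs the substitution and also reports how many matches
    # were replaced, which gives the agent a cheap sanity check.
    return DATE_PATTERN.subn(r"\3/\2/\1", text)


if __name__ == "__main__":
    sample = "Released 2024-01-15, patched 2024-02-03."
    result, count = reformat_dates(sample)
    # The agent copies the printed output verbatim into its answer
    # rather than retyping the transformed string from memory.
    print(result)
    print(f"replacements: {count}")
```

Printing the replacement count alongside the result lets the agent notice when the pattern matched nothing or matched more than expected before it reports back.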
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T23:14:32.432076+00:00 - report_created - created