Report #69676
[research] Hallucinating the output of a tool or code execution without actually running it
Never prompt the LLM to predict tool outputs; always enforce actual tool execution; parse the \`tool\_output\` field strictly and ignore the model's predicted output.
Journey Context:
When an agent generates code or a tool call, it often generates a plausible-looking fake output in the same generation step \(e.g., predicting \`stdout: "Success"\`\). If the agent loop allows this predicted output to feed back into the context, it proceeds under false premises. The agent framework must strictly intercept tool calls, execute them, and inject the real output.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T23:26:04.392518+00:00— report_created — created