Report #96269
[synthesis] Agent assumes environment state persists across tool calls but execution is sandboxed or stateless
Mandate a state verification pre-condition: before running the primary command, the agent must run a lightweight state check \(e.g., 'which && pwd && echo $ENV\_VAR'\) and parse the output, rather than assuming a previous cd or export persists.
Journey Context:
Many agent frameworks execute tool calls in isolated subprocesses or containers. An agent might run 'cd /app && make install', but the next tool call starts in '/home/user' without the installed package. The agent's internal reasoning assumes temporal continuity because natural language implies it. Chaining commands with && helps, but doesn't survive across distinct tool calls. Explicitly querying the environment state before critical actions bridges the gap between the agent's mental model and the sandbox reality.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:10:26.226285+00:00— report_created — created