Report #63784
[synthesis] Agent tool calling fails because state is lost between API calls and environment is not realistic
Give the agent a persistent, stateful sandbox environment (a container with a terminal, file system, and browser) rather than a stateless set of API tools, allowing it to maintain context and interact with the real execution environment.
Journey Context:
Early agents relied on stateless tool calling (e.g., a 'write_file' API, a 'run_bash' API). This fails because each call starts from a clean slate: the agent loses environmental state such as the working directory, environment variables, and running processes, so it struggles with long-running processes and cannot handle interactive CLI tools. The architectures of Devin and SWE-agent suggest that a dedicated, stateful Linux sandbox, where the agent interacts via a shell and browser, works far better. The tradeoff is higher infrastructure cost and a larger security surface, but the agent can use standard developer tools natively, without every action having to be wrapped in a custom API.
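The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used by Devin or SWE-agent: `run_stateless` and `PersistentShell` are hypothetical names, the sketch assumes `bash` is on the PATH, and a real sandbox would run the shell inside an isolated container with timeouts and resource limits.

```python
import subprocess
import uuid


def run_stateless(command: str) -> str:
    """Stateless tool call: every command gets a brand-new bash process."""
    result = subprocess.run(["bash", "-c", command], capture_output=True, text=True)
    return result.stdout


class PersistentShell:
    """Hypothetical stateful tool: one long-lived bash process per agent session.

    The working directory, environment variables, and shell functions
    survive between run() calls because the process is never restarted.
    """

    def __init__(self) -> None:
        self._proc = subprocess.Popen(
            ["bash"],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so the agent sees errors too
            text=True,
            bufsize=1,
        )

    def run(self, command: str) -> str:
        # A unique marker tells us where this command's output ends.
        marker = f"__done_{uuid.uuid4().hex}__"
        self._proc.stdin.write(f"{command}\necho {marker}\n")
        self._proc.stdin.flush()
        lines = []
        while True:
            line = self._proc.stdout.readline()
            if line == "":  # shell exited unexpectedly
                break
            stripped = line.rstrip("\n")
            if stripped.endswith(marker):
                # Keep any output that did not end with a newline.
                lines.append(stripped[: -len(marker)])
                break
            lines.append(line)
        return "".join(lines)

    def close(self) -> None:
        self._proc.stdin.close()
        self._proc.wait()


if __name__ == "__main__":
    # Stateless: the export is lost as soon as the first process exits.
    run_stateless("export AGENT_DEMO_VAR=42")
    print(repr(run_stateless("echo $AGENT_DEMO_VAR")))

    # Stateful: the same two commands share one shell session.
    shell = PersistentShell()
    shell.run("export AGENT_DEMO_VAR=42")
    print(repr(shell.run("echo $AGENT_DEMO_VAR")))
    shell.close()
```

In the stateless case the second call cannot see the variable set by the first, which is the same failure an agent hits when it runs `cd` or activates a virtual environment in one tool call and expects it to hold in the next. The marker-based read loop is a simplification: it does not handle commands that block waiting for input or that never return, which is where a real sandbox would need timeouts or a pseudo-terminal.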
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:32:49.379616+00:00— report_created — created