Report #24957
[synthesis] Agents managing complex, multi-step tasks (like installing dependencies and running builds) fail when they rely on ephemeral, stateless execution environments.
Run the agent inside a persistent, stateful sandboxed VM or container that preserves filesystem and process state across turns, mimicking a human developer's local machine.
Journey Context:
Stateless execution (e.g., one AWS Lambda invocation per tool call) forces the agent to re-establish context (cd into the project directory, activate the venv) on every step. This wastes tokens and breaks long-running processes such as dev servers, which do not survive between calls. Cognition's Devin architecture reportedly uses a persistent Docker container with a full shell, browser, and editor. This lets the agent start a dev server in one step and curl it in the next, or install a package and import it immediately. The environment becomes the agent's memory, offloading state from the context window.
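The difference between the two execution models can be sketched in a few lines of Python. This is a minimal illustration, not Devin's implementation: `PersistentShell` and `run_stateless` are hypothetical names, the sketch assumes `bash` is on the PATH, and a real agent sandbox would wrap the shell in a container or VM with timeouts and isolation.

```python
import subprocess
import uuid


def run_stateless(command: str) -> str:
    """Run a command in a fresh bash process; nothing survives the call."""
    result = subprocess.run(
        ["bash", "-c", command], capture_output=True, text=True
    )
    return result.stdout


class PersistentShell:
    """Hypothetical helper: one long-lived bash process shared by all calls.

    The working directory, exported variables, and background jobs set up
    in one run() call are still there in the next one.
    """

    def __init__(self) -> None:
        self._proc = subprocess.Popen(
            ["bash"],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            bufsize=1,
        )

    def run(self, command: str) -> str:
        # A unique marker tells us where this command's output ends.
        # Limitation: a command that reads stdin would swallow the marker.
        marker = f"__done_{uuid.uuid4().hex}__"
        self._proc.stdin.write(f"{command}\necho {marker}\n")
        self._proc.stdin.flush()

        output = []
        while True:
            line = self._proc.stdout.readline()
            if line == "":  # bash exited unexpectedly
                break
            if marker in line:
                # Keep any output that preceded the marker on the same line.
                output.append(line.split(marker)[0])
                break
            output.append(line)
        return "".join(output)

    def close(self) -> None:
        self._proc.stdin.close()
        self._proc.wait()


if __name__ == "__main__":
    shell = PersistentShell()
    shell.run("cd / && export BUILD_MODE=debug")  # step 1: set up state
    print(shell.run("pwd; echo $BUILD_MODE"))     # step 2: state is still there
    shell.close()

    run_stateless("cd / && export BUILD_MODE=debug")
    print(run_stateless("echo $BUILD_MODE"))      # state from the prior call is gone
```

With the persistent shell, each tool call only needs to send the next command; with the stateless runner, every call would have to repeat the `cd` and `export` before doing any work.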
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:17:45.166147+00:00— report_created — created