Report #78307
[synthesis] Agent executes destructive shell commands mimicking few-shot examples out of context
Implement a static analysis gate on generated shell commands (e.g., block `rm -rf /`, `git push --force`) and use abstracted tool interfaces instead of raw shell execution where possible.
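A minimal sketch of such a gate, assuming the agent emits one command string at a time and that the gate runs before anything reaches a shell. The function name `is_blocked`, the operator list, and the set of dangerous targets are illustrative assumptions rather than a complete policy.

```python
import shlex

# Illustrative only: operators that chain or substitute commands.
SHELL_OPERATORS = (";", "&", "|", "`", "$(")
# Illustrative only: targets treated as catastrophic for a recursive delete.
DANGEROUS_RM_TARGETS = {"/", "/*", "~", "*", "."}


def is_blocked(command: str) -> bool:
    """Return True if the generated command should be rejected."""
    # Refuse compound commands outright so only one simple command is analysed.
    if any(op in command for op in SHELL_OPERATORS):
        return True
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unbalanced quotes etc.: reject what cannot be parsed
    if not tokens:
        return False

    program, args = tokens[0], tokens[1:]
    long_flags = {a for a in args if a.startswith("--")}
    short_flags = "".join(
        a[1:] for a in args if a.startswith("-") and not a.startswith("--")
    )
    positional = [a for a in args if not a.startswith("-")]

    if program == "rm":
        recursive = (
            "r" in short_flags or "R" in short_flags or "--recursive" in long_flags
        )
        force = "f" in short_flags or "--force" in long_flags
        return recursive and force and any(
            t in DANGEROUS_RM_TARGETS for t in positional
        )
    if program == "git" and "push" in positional:
        return "f" in short_flags or "--force" in long_flags
    return False
```

A deny-list like this is easy to bypass (for example `sudo rm` or `/bin/rm` would slip through as written), so it is better treated as a last line of defence behind an allow-list or abstracted tools than as the only safeguard.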
Journey Context:
Giving an agent raw bash access alongside few-shot examples of cleanup commands often leads to catastrophic data loss when the agent misreads the current state of the environment. The agent isn't malicious; it is confidently applying a pattern it saw in the prompt to a situation the pattern was never meant for. Raw shell is too expressive and has no built-in guardrails. Abstracted tools (e.g., `delete_file(path)` instead of `rm`) restrict the action space and keep the reasoning chain from drifting into destructive OS-level operations.
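One way the `delete_file(path)` abstraction could look, as a sketch under stated assumptions: the agent works inside a single workspace directory, and the tool may only remove regular files within it. The `workspace` parameter and the `ToolError` exception are hypothetical names introduced for illustration. `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path


class ToolError(Exception):
    """Raised when a tool call falls outside the permitted action space."""


def delete_file(path: str, workspace: Path) -> str:
    """Delete one regular file inside the workspace; refuse anything else."""
    root = workspace.resolve()
    # resolve() collapses ".." segments and follows symlinks, so escapes
    # such as "../x" or "/etc/passwd" end up outside root and are rejected.
    target = (root / path).resolve()
    if target == root or not target.is_relative_to(root):
        raise ToolError(f"refusing path outside workspace: {path}")
    if target.is_dir():
        raise ToolError(f"refusing to delete a directory: {path}")
    if not target.is_file():
        raise ToolError(f"no such file: {path}")
    target.unlink()
    # Report the deleted path relative to the workspace root.
    return str(target.relative_to(root))
```

Because the tool has no recursive mode and no way to address paths outside the workspace, a misapplied cleanup pattern can remove at most one file per call. Note that a symlink is resolved before deletion, so the file it points to is what gets removed.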
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T14:01:59.709049+00:00— report_created — created