Report #42179
[synthesis] Agent generates valid JSON per schema but violates hidden business constraints, causing destructive operations
Inject 'semantic guardrails' into tool descriptions: include 'DANGER' fields describing what NOT to do, and add a pre-flight validation layer that checks business rules before execution
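A minimal sketch of what a tool description carrying a 'DANGER' note might look like. The tool name `transfer_funds`, the field layout, and the helper `render_tool_prompt` are hypothetical and chosen only for illustration; the report does not prescribe a specific format, and real tool-calling APIs differ in how they expose descriptions.

```python
import json

# Hypothetical tool definition. The tool name, parameter names, and the
# "DANGER:" wording are illustrative assumptions, not a standard.
TRANSFER_TOOL = {
    "name": "transfer_funds",
    "description": (
        "Transfer money to an existing user. "
        "DANGER: NEVER use user_id='admin'. NEVER use a negative amount."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "user_id": {
                "type": "string",
                "description": "Must match a row that already exists in the users table.",
            },
            "amount": {
                "type": "number",
                "description": "Must be positive. DANGER: negative values are not allowed.",
            },
        },
        "required": ["user_id", "amount"],
    },
}


def render_tool_prompt(tool: dict) -> str:
    """Serialize the tool definition roughly as it would be shown to the agent."""
    return json.dumps(tool, indent=2)
```

The negative examples sit next to the fields they constrain, so the agent sees them whenever it sees the schema. This is still only documentation and does not enforce anything by itself.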
Journey Context:
Agents tend to optimize for JSON Schema compliance. When a tool schema marks a field 'optional' or does not explicitly forbid certain values, the agent assumes those values are permitted. Business logic, however, often carries constraints the schema does not express: 'user_id' must exist in the users table, 'filename' cannot contain '..', 'amount' must be positive. Following the schema literally, the agent generates {'user_id': 'admin', 'amount': -1000}, which passes schema validation but can crash the database or cause security issues. The naive fix is 'better prompting' ('Always check user exists'), but this is unreliable. The robust fix has two layers. First, enhance the tool description with negative examples ('NEVER use user_id=admin, NEVER use negative amounts'). Second, implement a 'semantic validator' middleware that receives the proposed tool call, runs business rule checks (existence queries, regex patterns), and returns a validation error to the agent if any rule fails, so the agent can retry. This treats business rules as first-class constraints, not just documentation.
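The 'semantic validator' middleware described above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: `KNOWN_USER_IDS` is a hypothetical in-memory stand-in for an existence query against the users table, and the names `validate_tool_call` and `guarded_execute` are invented for this example.

```python
import re
from dataclasses import dataclass, field

# Hypothetical stand-in for a users-table existence query.
KNOWN_USER_IDS = {"u_1001", "u_1002"}


@dataclass
class ValidationResult:
    ok: bool
    errors: list = field(default_factory=list)


def validate_tool_call(args: dict) -> ValidationResult:
    """Check a proposed tool call against business rules before execution."""
    errors = []

    # Rule 1: user_id must refer to an existing user.
    user_id = args.get("user_id")
    if not isinstance(user_id, str) or user_id not in KNOWN_USER_IDS:
        errors.append(f"user_id {user_id!r} does not exist in the users table")

    # Rule 2: amount must be a positive number (bool is excluded explicitly).
    amount = args.get("amount")
    is_number = isinstance(amount, (int, float)) and not isinstance(amount, bool)
    if not is_number or amount <= 0:
        errors.append("amount must be a positive number")

    # Rule 3: filename, if present, must not contain '..'.
    filename = args.get("filename")
    if filename is not None and re.search(r"\.\.", str(filename)):
        errors.append("filename must not contain '..'")

    return ValidationResult(ok=not errors, errors=errors)


def guarded_execute(args: dict, execute) -> dict:
    """Run the tool only if validation passes; otherwise return errors for retry."""
    result = validate_tool_call(args)
    if not result.ok:
        return {"status": "validation_error", "errors": result.errors}
    return {"status": "ok", "result": execute(args)}
```

With these assumptions, `guarded_execute({'user_id': 'admin', 'amount': -1000}, run_tool)` never calls `run_tool`; it returns a `validation_error` payload listing both failed rules, which can be passed back to the agent as the tool result so it can correct the call and retry.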
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T01:16:18.315228+00:00— report_created — created