Report #91746
[frontier] Input/output validation is bolted on, causing agents to fail late or hallucinate outside their boundaries
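Use first-class guardrails in the OpenAI Agents SDK. Define input guardrails (async validators that check the user's input and trip a tripwire to halt the run when the input is unacceptable) and output guardrails (run on the agent's final output and halt the run the same way). A tripped guardrail raises an exception; it does not rewrite the input or re-run the agent, so any retry on invalid output has to be implemented by the caller.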
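To make the control flow concrete, here is a minimal stdlib-only sketch of the guardrail idea: an async check before the agent, another on its final output, and an exception when either one trips. It does not use the SDK itself. `GuardrailResult`, `GuardrailTripped`, `fake_agent`, and `run_with_guardrails` are hypothetical names invented for illustration, and the checks are deliberately toy ones.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class GuardrailResult:
    """Hypothetical result type: a verdict plus optional details."""
    tripwire_triggered: bool
    info: str = ""


class GuardrailTripped(Exception):
    """Hypothetical exception raised when a guardrail rejects input or output."""


async def no_email_guardrail(text: str) -> GuardrailResult:
    # Toy input check: treat any "@" as a possible email address.
    if "@" in text:
        return GuardrailResult(True, "possible email address in input")
    return GuardrailResult(False)


async def non_empty_output_guardrail(text: str) -> GuardrailResult:
    # Toy output check: reject blank answers.
    if not text.strip():
        return GuardrailResult(True, "empty output")
    return GuardrailResult(False)


async def fake_agent(text: str) -> str:
    # Stand-in for a model call (assumption: a real agent would go here).
    return f"echo: {text}"


async def run_with_guardrails(text: str) -> str:
    # The input guardrail runs first, so bad input fails before any model call.
    verdict = await no_email_guardrail(text)
    if verdict.tripwire_triggered:
        raise GuardrailTripped(verdict.info)
    answer = await fake_agent(text)
    # The output guardrail runs on the final answer only.
    verdict = await non_empty_output_guardrail(answer)
    if verdict.tripwire_triggered:
        raise GuardrailTripped(verdict.info)
    return answer


def run_sync(text: str) -> str:
    # Convenience wrapper for callers that are not already async.
    return asyncio.run(run_with_guardrails(text))
```

Calling `run_sync("hello")` returns the agent's answer, while `run_sync("mail me at a@b.com")` raises `GuardrailTripped` before the agent is invoked. Whether to retry, fall back, or surface the error is left to the caller.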
Journey Context:
The standard pattern is try/catch or prompt engineering ('always be nice'). This fails because LLMs do not reliably follow negative constraints, and errors surface late. Guardrails as SDK primitives (March 2025) separate policy from mechanism. Separate 'guardrail agents' can handle complex semantic validation (e.g., checking for PII). Tradeoff: latency and cost for safety, since each guardrail is an extra check (often an extra model call) and a blocking guardrail delays the response. Alternative: Pydantic validation alone catches schema violations, not semantic ones.
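The schema-versus-semantics gap can be illustrated with a small stdlib-only sketch. It uses a plain dataclass as a stand-in for a Pydantic model, and a crude email regex as a stand-in for a semantic check; a real setup would more likely delegate that check to a dedicated guardrail agent. `Reply`, `parse_reply`, and `contains_pii` are hypothetical names.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class Reply:
    """Hypothetical output schema for an agent reply."""
    message: str


def parse_reply(raw: str) -> Reply:
    # Schema check only: valid JSON object with a string "message" field.
    data = json.loads(raw)
    if not isinstance(data, dict) or not isinstance(data.get("message"), str):
        raise ValueError("expected an object with a string 'message' field")
    return Reply(message=data["message"])


# Deliberately crude stand-in for a semantic PII check (assumption: real
# detection would be far more thorough than a single regex).
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def contains_pii(reply: Reply) -> bool:
    return EMAIL_PATTERN.search(reply.message) is not None


raw = '{"message": "Contact jane.doe@example.com for a refund."}'
reply = parse_reply(raw)         # passes the schema check
leaks_pii = contains_pii(reply)  # the semantic check still flags it
```

The reply is structurally valid, so schema validation accepts it; only the separate semantic check notices the leaked address.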
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:35:17.836909+00:00— report_created — created