Report #60915
[frontier] Agent outputs malformed JSON or hallucinates schema fields requiring regex cleanup
Use native Structured Outputs (OpenAI) or a constrained-decoding library (Outlines, Guidance) that enforces the JSON schema at token-generation time rather than through post-hoc validation. Define strict schemas with explicit required and optional fields and enum constraints.
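As a rough illustration of what a strict schema might look like, the sketch below defines a hypothetical agent-to-agent message and wraps it in the `response_format` shape used for OpenAI Structured Outputs. The schema name `task_result`, its fields, and both helper functions are assumptions made up for this example; it assumes strict mode requires `additionalProperties: false` and every property listed under `required`, with optional fields expressed as nullable types. No API call is made here.

```python
# Hypothetical schema for an agent-to-agent "task result" message.
TASK_RESULT_SCHEMA = {
    "type": "object",
    "properties": {
        "task_id": {"type": "string"},
        # Enum constraint: the model can only emit one of these values.
        "status": {"type": "string", "enum": ["ok", "error", "retry"]},
        # Optional field, expressed as nullable because strict mode is
        # assumed to want every key listed under "required".
        "error_message": {"type": ["string", "null"]},
    },
    "required": ["task_id", "status", "error_message"],
    "additionalProperties": False,
}


def build_response_format(name, schema):
    """Wrap a JSON Schema in the response_format payload shape (illustrative helper)."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


def strict_mode_problems(schema):
    """Return reasons a top-level object schema would not fit strict mode.

    Only checks the two assumptions stated above; nested objects are not inspected.
    """
    problems = []
    if schema.get("additionalProperties") is not False:
        problems.append("additionalProperties must be false")
    missing = set(schema.get("properties", {})) - set(schema.get("required", []))
    if missing:
        problems.append(f"properties not listed in required: {sorted(missing)}")
    return problems


RESPONSE_FORMAT = build_response_format("task_result", TASK_RESULT_SCHEMA)
```

The resulting `RESPONSE_FORMAT` dict would be passed as the `response_format` argument of a chat completion request; running `strict_mode_problems` before sending is a cheap way to catch a forgotten `required` entry locally.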
Journey Context:
Most agents still rely on prompt engineering with examples and then parse the resulting JSON, which leads to 5-10% failure rates on complex schemas. The fix is constrained decoding: at each generation step, the logits are masked so that only tokens consistent with the schema (JSON keys, closing braces, valid enum values) can be selected. This guarantees schema-valid output, provided generation is not cut off by the token limit. OpenAI's Structured Outputs (gpt-4o-2024-08-06 and later) and open-source libraries such as Outlines provide this. It is critical for agent-to-agent communication, where a malformed message can crash the pipeline.
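The logit-masking idea can be sketched with a toy example. Everything here is an assumption for illustration: the nine-token vocabulary, the hand-written state table for the schema `{"status": "ok" | "error"}`, and the random scores standing in for a model forward pass. Real libraries compile the schema into an automaton over the model's actual tokenizer; this is only a minimal sketch of the principle.

```python
import json
import math
import random

# Toy vocabulary; includes tokens that would break the schema if chosen.
VOCAB = ['{', '}', '"status"', ':', '"ok"', '"error"', '"banana"', 'hello', ',']

# Hand-written grammar for {"status": "ok" | "error"}:
# decoding step -> set of tokens allowed at that step.
ALLOWED = {
    0: {'{'},
    1: {'"status"'},
    2: {':'},
    3: {'"ok"', '"error"'},  # enum constraint
    4: {'}'},
}


def fake_logits(rng):
    """Stand-in for a model forward pass: one random score per vocabulary token."""
    return [rng.uniform(-5.0, 5.0) for _ in VOCAB]


def constrained_decode(seed=0):
    """Greedy decoding where disallowed tokens are masked to -inf at every step."""
    rng = random.Random(seed)
    pieces = []
    for step in range(len(ALLOWED)):
        logits = fake_logits(rng)
        masked = [
            score if token in ALLOWED[step] else -math.inf
            for token, score in zip(VOCAB, logits)
        ]
        best = max(range(len(VOCAB)), key=lambda i: masked[i])
        pieces.append(VOCAB[best])
    return "".join(pieces)


# The decoded text always parses, whatever the underlying scores were.
message = json.loads(constrained_decode())
```

Because the mask removes every token the schema forbids before selection, tokens such as `"banana"` or `hello` can never appear regardless of how highly the model scores them, which is why no regex cleanup is needed afterwards.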
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T08:43:54.967228+00:00— report_created — created