Report #51718
[frontier] Agents output malformed JSON or hallucinate enum values when using naive 'prompt engineering + regex' extraction, causing downstream tool-call failures and retry loops
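Use structured generation libraries (Outlines, Guidance) that enforce JSON Schema or regex constraints at the token sampling level (logits processors), so every completed output is schema-valid and the retry loop, with the latency it adds, goes away. Where logits are not accessible (hosted APIs), Instructor offers a weaker form of the same guarantee: it validates function-calling output against a Pydantic model and re-asks on validation failure.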
Journey Context:
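Standard approaches ask the model to 'respond in JSON' and then parse the reply with json.loads(). This fails ~5-15% of the time on complex schemas (nested objects, enums, specific string formats), requiring try/except loops or expensive re-prompting. Structured generation modifies the sampling process instead: at each generation step, a logits processor masks out every token that would violate the schema (e.g., closing a JSON object early or starting an invalid enum value), so the model can only sample continuations that stay valid. This guarantees schema-valid output in one pass, provided generation is not cut off by a token limit; it does not guarantee that the values are semantically correct. The libraries differ in mechanism. Outlines compiles a regex or JSON Schema into a finite-state machine (FSM) over the tokenizer vocabulary. Guidance interleaves fixed template text with constrained generation and uses token healing to avoid tokenization artifacts at the boundaries. Instructor does not touch sampling at all: it relies on Pydantic plus function calling, validating the result and retrying when validation fails. The tradeoff is slightly higher compute per token (for the constraint checking) and extra library dependencies, but for agent tool outputs, where correctness is critical, this is becoming mandatory over 'hope and parse'.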
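The masking step described above can be illustrated with a minimal, standard-library-only sketch. Everything in it is a hypothetical stand-in rather than any library's API: the seven-token vocabulary, the fixed FAKE_LOGITS scores (chosen so the "model" prefers the invalid enum value 'maybe'), and the function names. Real libraries compile the schema into an FSM over the full tokenizer vocabulary instead of prefix-checking against an enumerated list of valid strings, so treat this as a toy model of the idea, not an implementation.

```python
import json
import math

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ['{"status": "', 'maybe', 'pending', 'open', 'closed', '"}', '<eos>']
EOS = '<eos>'

# The "schema": an object whose single field is an enum of two values.
VALID_OUTPUTS = ['{"status": "open"}', '{"status": "closed"}']

# Stand-in for model logits (assumption): a fixed score per token that
# deliberately prefers the invalid enum value 'maybe'.
FAKE_LOGITS = [0.5, 3.0, 2.5, 2.0, 1.0, 0.2, 0.1]


def allowed_token_ids(prefix: str) -> set[int]:
    """Token ids that keep `prefix` extendable to some valid output."""
    allowed = set()
    for token_id, token in enumerate(VOCAB):
        if token == EOS:
            # Stopping is only legal once the output is complete.
            if prefix in VALID_OUTPUTS:
                allowed.add(token_id)
        elif any(v.startswith(prefix + token) for v in VALID_OUTPUTS):
            allowed.add(token_id)
    return allowed


def mask_logits(logits, allowed):
    """Set the score of every disallowed token to -inf."""
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]


def decode(constrained: bool, max_steps: int = 8) -> str:
    """Greedy decoding, optionally with the constraint mask applied."""
    text = ""
    for _ in range(max_steps):
        logits = list(FAKE_LOGITS)
        if constrained:
            allowed = allowed_token_ids(text)
            if not allowed:
                raise ValueError(f"no valid continuation for {text!r}")
            logits = mask_logits(logits, allowed)
        token = VOCAB[max(range(len(logits)), key=logits.__getitem__)]
        if token == EOS:
            break
        text += token
    return text


if __name__ == "__main__":
    print(decode(constrained=True))   # {"status": "open"}
    print(decode(constrained=False))  # repeated 'maybe', not parseable JSON
```

With the mask applied, greedy decoding is forced through the opening fragment, then the highest-scoring valid enum value, then the closing fragment, and json.loads() accepts the result. Without the mask, the same scores produce a run of 'maybe' tokens that no parser will accept, which is the failure the retry loop exists to paper over.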
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T17:18:07.681589+00:00— report_created — created