Report #62075
[frontier] LLM outputs fail downstream JSON parsing or violate business constraints (negative prices, future dates in fields that must lie in the past)
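Use constrained decoding (Outlines, Guidance) to enforce valid JSON and grammar-expressible business rules at the token generation level, rather than relying only on post-hoc validation. Instructor works differently: it validates output against Pydantic models and retries on failure, so it belongs with the validate-and-retry approach rather than with token-level constraints.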
Journey Context:
The 'generate then validate' pattern (try/except around json.loads, retrying with 'fix the JSON' prompts) is fragile and burns tokens on parse errors. It also cannot enforce value constraints (e.g., 'price must be positive and less than 1000') during generation. Emerging production systems compile a JSON Schema or an EBNF grammar into a logits processor, so that every sampled token keeps the output a valid prefix of the target format. Libraries like Outlines (dottxt-ai/outlines) use finite state machines to mask invalid tokens (e.g., a '}' that would close an object before its required fields are present). For output that runs to completion, this removes parse failures and makes schema violations impossible by construction; generation that is cut off by a token limit can still end mid-structure, so the length cap must be checked. Business rules that can be written as a regular expression or context-free grammar (numeric ranges, enumerations, date formats) can be enforced by the decoder in the same way. Rules that depend on external state or relate several fields to each other (e.g., a date that must be earlier than today) generally cannot, and still need a validation step after generation. Removing retry loops reduces latency and improves reliability, which is critical for agent systems where tool arguments must match exact API schemas.
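The token-masking idea can be sketched without any model or library. The following is a toy illustration only, not how Outlines or Guidance are implemented: the state machine, the vocabulary, the fixed scores in `toy_logits`, and every function name are hypothetical. The state machine accepts a price with an integer part of 1 to 999 and at most two decimals, which is a simplified stand-in for 'positive and less than 1000'.

```python
import math

# Hypothetical state machine for a price: integer part 1-999, up to two decimals.
# "n" = digit 1-9, "d" = any digit, "." = decimal point.
TRANSITIONS = {
    0: {"n": 1},            # start: first digit must be non-zero
    1: {"d": 2, ".": 4},    # one integer digit seen
    2: {"d": 3, ".": 4},    # two integer digits seen
    3: {".": 4},            # three integer digits seen, no fourth allowed
    4: {"d": 5},            # just after the decimal point
    5: {"d": 6},            # one decimal seen
    6: {},                  # two decimals seen
}
ACCEPTING = {1, 2, 3, 5, 6}  # states where the output is a complete price
EOS = "<eos>"

# Toy vocabulary with multi-character tokens, as in a real tokenizer.
VOCAB = ["-", "1000", "0", "12", ".", "5", "abc", EOS]


def step_char(state, ch):
    """Return the next state for one character, or None if it is not allowed."""
    row = TRANSITIONS[state]
    if ch == ".":
        return row.get(".")
    if ch in "0123456789":
        if "n" in row:
            return row["n"] if ch != "0" else None
        return row.get("d")
    return None


def advance(state, token):
    """Return the state after consuming a whole token, or None if it leaves the grammar."""
    for ch in token:
        state = step_char(state, ch)
        if state is None:
            return None
    return state


def allowed(state, token):
    """A token is allowed if it keeps the output a valid prefix; EOS only when complete."""
    if token == EOS:
        return state in ACCEPTING
    return advance(state, token) is not None


def toy_logits(prefix_tokens):
    """Stand-in for a model: fixed scores that prefer invalid tokens such as '-' and '1000'."""
    return [5.0, 4.0, 3.5, 3.0, 2.0, 1.0, 0.5, 0.0]


def constrained_greedy(logits_fn, vocab, max_steps=8):
    """Greedy decoding with disallowed tokens masked to -inf before the argmax."""
    state, out = 0, []
    for _ in range(max_steps):
        logits = logits_fn(out)
        masked = [
            score if allowed(state, tok) else -math.inf
            for tok, score in zip(vocab, logits)
        ]
        best = max(range(len(vocab)), key=masked.__getitem__)
        if masked[best] == -math.inf:
            break  # no token is allowed from this state
        token = vocab[best]
        if token == EOS:
            break
        out.append(token)
        state = advance(state, token)
    # The second value reports whether decoding stopped on a complete price.
    return "".join(out), state in ACCEPTING


text, complete = constrained_greedy(toy_logits, VOCAB)
print(text, complete)
```

Without the mask, the highest-scoring token is always '-', so plain greedy decoding would never produce a valid price. Calling `constrained_greedy(toy_logits, VOCAB, max_steps=3)` stops on an incomplete prefix and returns `False` as the second value, which mirrors the token-limit caveat: the mask guarantees a valid prefix, not a finished value.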
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T10:40:51.842273+00:00— report_created — created