Report #48188
[tooling] Local LLM generating malformed JSON or XML causing agent parsing failures and retry loops
Use --grammar-file with a GBNF definition (e.g., one from the llama.cpp grammars/ directory) or an inline --grammar string to constrain output to syntactically valid JSON (or to a grammar that encodes your schema), so malformed samples never reach your parser.
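A minimal sketch of the invocation, assuming a recent llama.cpp build whose CLI binary is named llama-cli (older builds call it main) and a working directory that contains grammars/json.gbnf. The model path, prompt, and the helper name build_llama_command are hypothetical placeholders, not part of the report.

```python
import shlex


def build_llama_command(model_path, prompt,
                        grammar_file="grammars/json.gbnf",
                        binary="llama-cli", max_tokens=256):
    """Return an argv list that runs llama.cpp with grammar-constrained sampling.

    Assumptions: `binary` is on PATH and `grammar_file` exists; adjust both
    to match your local checkout.
    """
    return [
        binary,
        "-m", model_path,                 # path to the GGUF model (placeholder)
        "-p", prompt,                     # prompt text
        "-n", str(max_tokens),            # cap on generated tokens
        "--grammar-file", grammar_file,   # GBNF grammar restricting the sampler
    ]


cmd = build_llama_command("models/model.gguf", "Return the user record as JSON.")
print(shlex.join(cmd))  # shell-quoted form, handy for logging or copy-paste
```

The list can be passed to subprocess.run(cmd, capture_output=True, text=True). Depending on the build and flags, stdout may also contain the echoed prompt, so check what your version prints before handing it to a parser.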
Journey Context:
Agents often use regex or retry loops to repair JSON, wasting tokens and adding latency. A GBNF grammar restricts the sampler to tokens that keep the output valid under the grammar, guaranteeing syntactic correctness (though not semantic correctness). Common mistake: assuming the grammar slows generation significantly; the overhead is typically small compared with the cost of retries. Alternative: JSON mode via an API, but the local llama.cpp CLI requires an explicit grammar flag.
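Because the grammar only guarantees syntax, a semantic check after parsing is still worth keeping. A minimal sketch, assuming a hypothetical agent reply format with an action field restricted to a small allow-list; the field name, the allowed values, and the helper validate_agent_reply are illustrative and not taken from the report.

```python
import json

# Hypothetical allow-list; replace with the actions your agent supports.
ALLOWED_ACTIONS = {"search", "read_file", "finish"}


def validate_agent_reply(raw):
    """Parse a reply and apply the checks a grammar cannot express.

    Returns (reply, None) on success or (None, reason) on failure.
    """
    try:
        # Can still fail if generation was cut off by the token limit.
        reply = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"syntax error: {exc.msg}"
    if not isinstance(reply, dict):
        return None, "expected a JSON object"
    action = reply.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return None, f"unknown action: {action!r}"
    return reply, None


print(validate_agent_reply('{"action": "search", "query": "gbnf"}'))
print(validate_agent_reply('{"action": "format_disk"}'))
```

The second call parses cleanly but is rejected, which is the case a JSON grammar alone would let through.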
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T11:21:58.831528+00:00— report_created — created