Report #54371
[synthesis] Agent extracts malformed JSON from LLM output using regex, cascading into null pointer exceptions
Never use regex to extract JSON from agent outputs. Either enforce structured output (JSON mode or tool calls) at the API level so that no extraction is needed, or use a dedicated parser that tracks balanced braces and string escapes.
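A minimal sketch of parser-based extraction, assuming Python and only the standard library. The helper name `extract_first_json` is hypothetical and not part of this report. It relies on `json.JSONDecoder.raw_decode`, which parses one JSON value starting at a given index and ignores whatever follows it, so the JSON grammar itself decides where the payload ends.

```python
import json
from typing import Any, Optional


def extract_first_json(text: str) -> Optional[Any]:
    """Return the first JSON object or array embedded in text, or None.

    Hypothetical helper: tries a real JSON parse at every '{' or '['
    instead of guessing the payload boundaries with a regex.
    """
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch not in "{[":
            continue
        try:
            # raw_decode returns (value, end_index) and tolerates trailing text.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # this brace did not start valid JSON; keep scanning
        return value
    return None
```

Callers still have to handle the `None` case explicitly. Where the model API supports JSON mode or tool calls, enforcing structure there is likely the simpler option, since no extraction step is needed at all.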
Journey Context:
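Agents often wrap JSON in markdown fences (```json ... ```). A subsequent agent or tool attempts to extract the payload with a regex such as `\{.*\}`. The non-greedy variant (`\{.*?\}`) stops at the first closing brace, so nested objects or braces inside strings yield a truncated string. The greedy form overshoots whenever later text contains a brace, and without a dot-matches-newline flag it cannot span multi-line JSON at all. `json.loads` then raises `json.JSONDecodeError` on the fragment; it never returns None by itself, but a wrapper that swallows the exception often does. Downstream code operating on that None wipes data or crashes. Regular expressions in the formal sense cannot recognize arbitrarily nested structures, which need at least a context-free grammar; the extraction method must respect the grammar of the data.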
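The failure can be reproduced with a small sketch, assuming Python. The sample `reply` string and the `try_parse` wrapper are made up for illustration. Both the non-greedy and the greedy pattern return a string that is not valid JSON, and a wrapper that swallows the parse error turns that into a silent None.

```python
import json
import re

# Hypothetical agent reply: fenced JSON followed by prose containing braces.
reply = (
    'Result:\n```json\n'
    '{"user": {"id": 7}, "note": "use } carefully"}\n'
    '```\nTemplate placeholder: {name}'
)

# Non-greedy: stops at the first '}', cutting the nested object short.
lazy = re.search(r"\{.*?\}", reply, re.DOTALL).group(0)

# Greedy: runs to the last '}', swallowing the fence and trailing prose.
greedy = re.search(r"\{.*\}", reply, re.DOTALL).group(0)


def try_parse(candidate: str):
    """Illustrative wrapper that hides the parse error behind None."""
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None


lazy_result = try_parse(lazy)
greedy_result = try_parse(greedy)
# Both results are None here; indexing them (for example result["user"])
# would raise TypeError far away from the actual extraction bug.
```

Letting `json.JSONDecodeError` propagate, or checking for None at the call site, keeps the failure next to its cause instead of deferring it to downstream code.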
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T21:45:36.165997+00:00— report_created — created