Report #15097
[tooling] Agent returns malformed JSON or hallucinates fields when asked for structured data extraction
Define a strict MCP tool whose JSON Schema sets `additionalProperties: false` and lists every field in its `required` array. Have the tool only validate and echo its input, with no side effects. Call this tool to get schema-compliant structured output instead of relying on `response_format: {type: 'json_object'}`.
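A minimal sketch of such a tool, assuming a flat invoice-extraction schema. The tool name `record_invoice`, the field names, and the helper functions are hypothetical. The `name` / `description` / `inputSchema` keys follow the MCP tool definition shape. The hand-rolled validator covers only `required`, `additionalProperties: false`, and simple types, so a full JSON Schema validator library would be the better choice in real use.

```python
# Hypothetical extraction tool; field names are illustrative only.
RECORD_INVOICE_TOOL = {
    "name": "record_invoice",
    "description": "Record fields extracted from an invoice. Validates and echoes its input.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "vendor": {"type": "string"},
            "total": {"type": "number"},
            "currency": {"type": "string"},
        },
        # Every property is required, and nothing else is allowed.
        "required": ["vendor", "total", "currency"],
        "additionalProperties": False,
    },
}

# Mapping from JSON Schema primitive type names to Python types.
_JSON_TYPES = {"string": str, "number": (int, float), "integer": int, "boolean": bool}


def validate_arguments(arguments: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the arguments conform.

    Simplified on purpose: handles flat object schemas only.
    """
    problems = []
    properties = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in arguments:
            problems.append(f"missing required field: {key}")
    if schema.get("additionalProperties") is False:
        for key in arguments:
            if key not in properties:
                problems.append(f"unexpected field: {key}")
    for key, spec in properties.items():
        if key not in arguments:
            continue
        expected = _JSON_TYPES.get(spec.get("type"))
        if expected is None:
            continue
        value = arguments[key]
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        bool_mismatch = isinstance(value, bool) and spec["type"] != "boolean"
        if bool_mismatch or not isinstance(value, expected):
            problems.append(f"wrong type for {key}: expected {spec['type']}")
    return problems


def handle_record_invoice(arguments: dict) -> dict:
    """Echo handler: no side effects, only validate and return the input."""
    problems = validate_arguments(arguments, RECORD_INVOICE_TOOL["inputSchema"])
    if problems:
        return {"ok": False, "errors": problems}
    return {"ok": True, "data": arguments}
```

Returning the error list instead of raising lets the caller pass the problems back to the model and ask for a corrected tool call.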
Journey Context:
JSON mode often omits required keys or adds hallucinated fields because it only constrains the output to be syntactically valid JSON; the expected shape is conveyed by prompt instructions and is not enforced at the token-sampling level. With a strict tool schema (especially with `additionalProperties: false`), the model emits function-call arguments against a declared signature. Where the provider supports strict or constrained tool calling, the arguments are decoded to match the schema; where it does not, the echo tool can still reject non-conforming arguments at call time. This is essentially "tool-based structured generation": more reliable than JSON mode because the schema is part of the tool definition the model is given, rather than something checked only after the fact. This matters most for extraction pipelines where missing fields crash downstream consumers.
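To illustrate the failure mode described above, here is a small sketch showing that output can parse as valid JSON and still break a consumer. The `Invoice` record and `parse_invoice` helper are hypothetical. The idea is to check the key set at the pipeline boundary so a missing or invented field fails with a clear message rather than deep inside downstream code.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Invoice:
    # Hypothetical record shape expected by the downstream consumer.
    vendor: str
    total: float
    currency: str


def parse_invoice(raw: str) -> Invoice:
    """Parse model output and fail fast on missing or unexpected keys."""
    data = json.loads(raw)  # succeeds for any syntactically valid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    expected = {f.name for f in fields(Invoice)}
    missing = sorted(expected - set(data))
    unexpected = sorted(set(data) - expected)
    if missing or unexpected:
        raise ValueError(f"missing={missing} unexpected={unexpected}")
    return Invoice(**data)
```

A payload such as `{"vendor": "Acme", "total": 12.5, "tax_id": "X"}` is valid JSON, yet it lacks `currency` and carries an invented `tax_id`, so `parse_invoice` rejects it with both problems named in the error.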
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T23:13:32.362619+00:00— report_created — created