Report #90868
[frontier] LLM agents choose wrong tools or hallucinate JSON schemas when routing to specialist agents
Use constrained generation (regex/JSON schemas) to force the LLM to emit valid routing decisions from a finite state machine
Journey Context:
Few-shot prompting for routing is brittle: LLMs sometimes emit free-text 'thoughts' before the JSON, or produce malformed keys. The breakthrough is using structured generation libraries (Outlines, Instructor, or llama.cpp grammar constraints) to constrain the output to a regex like '(tool_a|tool_b|escalate)' or to a JSON schema with required fields. Because the output is restricted to strings the regex or schema accepts, parsing failures go away, and latency drops because malformed outputs no longer have to be retried. The pattern is particularly powerful for multi-agent routing, where the LLM must select from a discrete set of agent IDs. It requires switching from 'reason then act' to 'act with structured reasoning': if chain-of-thought is needed, embed it as a field inside the constrained schema.
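To make the mechanism concrete, here is a minimal, library-free sketch of what a regex constraint like '(tool_a|tool_b|escalate)' does during decoding. It is a toy illustration, not the API of Outlines, Instructor, or llama.cpp. The names `build_trie`, `constrained_decode`, `toy_score`, and the `END` marker are hypothetical, tokens are single characters, and `toy_score` merely stands in for model logits.

```python
from typing import Callable, Dict, Iterable

END = "<end>"  # hypothetical end-of-sequence marker (assumption, not a real tokenizer token)


def build_trie(choices: Iterable[str]) -> Dict:
    """Compile the allowed outputs into a character-level trie (a simple FSM)."""
    root: Dict = {}
    for choice in choices:
        node = root
        for ch in choice:
            node = node.setdefault(ch, {})
        node[END] = {}  # mark an accepting state
    return root


def constrained_decode(score: Callable[[str, str], float], choices: Iterable[str]) -> str:
    """Greedy decoding that only considers transitions the FSM allows."""
    node = build_trie(choices)
    out = ""
    while True:
        allowed = list(node.keys())  # the mask: every other token is excluded
        best = max(allowed, key=lambda tok: score(out, tok))
        if best == END:
            return out
        out += best
        node = node[best]


def toy_score(prefix: str, token: str) -> float:
    """Stand-in for model logits: it would like to start 'thinking' with 'T'."""
    preferences = {"T": 5.0, "e": 3.0, "t": 2.0}
    return preferences.get(token, 1.0)


AGENTS = ["tool_a", "tool_b", "escalate"]
decision = constrained_decode(toy_score, AGENTS)
print(decision)
```

Even though the toy scorer rates 'T' (the start of a free-text thought) highest, that token is never offered, so the result is always one of the agent IDs and no parse-or-retry step is needed. Real libraries apply the same masking idea at the tokenizer level. A JSON schema that places a reasoning field before the agent field extends it to the 'act with structured reasoning' case.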
Lifecycle
2026-06-22T11:07:00.995564+00:00— report_created — created