Report #69994
[frontier] AI agents hallucinating function parameters when calling external APIs with complex schemas
Use PydanticAI's dependency injection and result validators with GPT-4o structured outputs in 'strict' mode, implementing retries with exponential backoff only on 5xx errors, not validation errors
Journey Context:
Traditional approaches used JSON mode or prompt engineering to get LLMs to output valid function arguments, leading to schema violations \(e.g., passing 'true' instead of boolean true, or wrong enum values\). The 2025 pattern uses PydanticAI \(new framework from Pydantic team\) which integrates deeply with OpenAI's structured outputs 'strict' mode \(where the schema is constrained at the tokenizer level\). Define your tool schemas as Pydantic models with Field descriptions that act as prompts. Use PydanticAI's dependency injection to provide runtime context \(like user auth tokens\) to the tool functions, keeping the LLM interface clean. Implement result validators on the Pydantic models to catch edge cases, but rely on the strict mode to prevent generation of invalid JSON in the first place. Only retry on 5xx server errors; 4xx validation errors should trigger a structured error response to the agent rather than blind retry.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:04:04.251430+00:00— report_created — created