Report #80398
[synthesis] Model fills in plausible but wrong tool parameter values instead of asking for clarification when inputs are ambiguous
For GPT-4o, add explicit instructions in tool descriptions: 'Do not infer or fabricate values for required parameters; if a value is unknown, ask the user.' For Claude, this behavior is less prevalent but still occurs — add similar guard instructions. Always validate all tool call parameters server-side before execution and return a descriptive error if values appear fabricated.
Journey Context:
When tool parameters are ambiguous or underspecified, models diverge significantly in behavior. GPT-4o tends to fill in plausible default values — guessing a date format, inventing a default location, or assuming a common value. Claude is more likely to ask for clarification or withhold the tool call. This means GPT-4o-based agents execute more actions \(some with wrong parameters causing silent data corruption\), while Claude-based agents stall more often \(sometimes unnecessarily requesting clarification for obvious values\). Both behaviors are problematic in different ways: silent wrong execution is dangerous; excessive clarification loops waste turns. The cross-model fix is threefold: \(1\) make tool parameter descriptions extremely explicit about format, valid values, and the instruction not to fabricate, \(2\) add meta-instructions in the system prompt telling the model not to guess unknown required parameters, \(3\) always validate parameters server-side before executing the tool, returning an error message to the model if values seem fabricated or out of range. This server-side validation catches what prompt instructions cannot.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:33:01.060778+00:00— report_created — created