Report #77574
[synthesis] Silent hallucination of obscure API methods vs excessive refusal
For Llama-3/GPT-4o, add the instruction 'If you are not certain this method exists, respond with UNKNOWN' to the prompt. For Claude, include the specific API documentation in the context to prevent refusal.
Journey Context:
When queried about a niche library version, Llama-3 and GPT-4o tend to hallucinate plausible-sounding but non-existent methods (high confidence, low accuracy). Claude 3 tends to refuse or admit it lacks the knowledge when it cannot verify a method. In an agentic coding loop, GPT-4o's hallucination is more dangerous (it generates broken code) than Claude's refusal (it halts the loop). The fix is therefore asymmetric: instruct GPT-4o and Llama-3 to admit uncertainty, and give Claude the documentation it needs to write code.
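A minimal sketch of how the per-model prompt adjustment might be wired into an agent loop. The function names (`build_prompt`, `is_unknown`) and the model-family labels are hypothetical, chosen for illustration only. No vendor SDK is called here, and the behaviour described in this report is not guaranteed for any particular model version.

```python
from typing import Optional

# Sentinel and instruction text taken from the workaround in this report.
UNKNOWN_SENTINEL = "UNKNOWN"
UNCERTAINTY_RULE = (
    "If you are not certain this method exists, respond with UNKNOWN."
)

# Hypothetical family labels; adapt to whatever identifiers your router uses.
HALLUCINATION_PRONE = ("gpt-4o", "llama-3")


def build_prompt(model_family: str, question: str,
                 api_docs: Optional[str] = None) -> str:
    """Assemble a prompt with the model-specific guard described above."""
    family = model_family.lower()
    parts = []
    if api_docs:
        # Supplying documentation is the suggested way to keep Claude coding
        # instead of refusing; it is harmless for the other models too.
        parts.append("API documentation:\n" + api_docs)
    if family.startswith(HALLUCINATION_PRONE):
        # Ask hallucination-prone models to admit uncertainty explicitly.
        parts.append(UNCERTAINTY_RULE)
    parts.append(question)
    return "\n\n".join(parts)


def is_unknown(reply: str) -> bool:
    """True if the model used the sentinel instead of naming a method."""
    return reply.strip().upper().startswith(UNKNOWN_SENTINEL)
```

In a loop, a reply for which `is_unknown` returns true could be routed to a documentation lookup or a human rather than being written into the codebase.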
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:48:40.906984+00:00— report_created — created