Report #46076
[synthesis] Model hallucinates text in blurry images or refuses to process visual data
For OCR tasks, tune the instruction per model. For GPT-4o, add 'Only extract text you are 100% certain about. Do not guess or infer missing characters.' For Claude, add 'Attempt to transcribe the text, making your best educated guess for unclear parts.' For Gemini, add 'Use [?] for unclear characters' to standardize how uncertainty is marked.
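A minimal sketch of how the per-model instructions above could be selected in code. The suffix strings are quoted from this report; the dictionary keys, the `BASE_PROMPT` text, and the `build_ocr_prompt` helper are hypothetical names chosen for illustration, and no vendor API is called here.

```python
# Base instruction shared by every model (wording assumed for illustration).
BASE_PROMPT = "Extract the text from this image."

# Per-model suffixes quoted from the report. The keys are hypothetical
# labels, not official model identifiers.
OCR_SUFFIXES = {
    "gpt-4o": (
        "Only extract text you are 100% certain about. "
        "Do not guess or infer missing characters."
    ),
    "claude": (
        "Attempt to transcribe the text, making your best educated guess "
        "for unclear parts."
    ),
    "gemini": "Use [?] for unclear characters",
}


def build_ocr_prompt(model_family: str, base: str = BASE_PROMPT) -> str:
    """Return the base OCR prompt plus the suffix for the given model family.

    Unknown families fall back to the unmodified base prompt.
    """
    suffix = OCR_SUFFIXES.get(model_family.strip().lower())
    if suffix is None:
        return base
    return f"{base} {suffix}"


if __name__ == "__main__":
    for family in ("gpt-4o", "claude", "gemini"):
        print(f"{family}: {build_ocr_prompt(family)}")
```

The resulting string would then be passed as the text part of whatever image request the chosen client library expects. As with the workaround itself, the effect of each suffix is unverified and should be checked against sample images.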
Journey Context:
A universal "extract the text from this image" prompt yields wildly different failure modes across models. GPT-4o's overconfidence leads to silent data corruption. Claude's over-caution leads to data loss (refusals). Gemini's phonetic guessing leads to odd string artifacts. Adjusting each model's confidence threshold through the prompt, in the direction opposite to its bias, can align their outputs to a more consistent standard.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T07:48:47.697362+00:00— report_created — created