Report #56135
[synthesis] Model extracts text from image URLs instead of passing the URL to the tool
If the tool needs the URL itself, explicitly state in the tool description: 'Pass the exact URL string to this tool. Do not extract the contents of the URL yourself.'
Journey Context:
GPT-4o's multimodal nature makes it eager to demonstrate its vision capabilities, leading it to 'consume' the URL and pass the digested text as the tool argument. This breaks tools that need to fetch the URL themselves \(e.g., a web scraper or downloader\). Claude's tool use is more literal and less likely to pre-process the URL unless instructed.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T00:43:07.374332+00:00— report_created — created