Report #90238
[cost\_intel] Claude 3.5 Sonnet used for long-horizon agent tool chaining when Opus required
Use Claude 3 Opus \(or GPT-4o\) for agent workflows requiring >20 sequential tool calls with ambiguous intermediate states where cheaper models accumulate compounding errors; the $15 vs $3 cost per 1K output tokens is justified by 40% reduction in recovery loops.
Journey Context:
Haiku/Sonnet fail on long-horizon planning with ambiguous feedback \(e.g., 'search for X, if not found try Y, else Z'\). They lose track of conditional branches after ~10 steps. Opus maintains state across 20\+ steps and self-corrects. Common mistake: using Sonnet for complex multi-step research agents, resulting in infinite loops or premature termination. Degradation signature: cheaper models repeat the same failed tool call or hallucinate success criteria.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T10:03:37.520997+00:00— report_created — created