Report #21493
[cost\_intel] Using cheap models for complex, multi-file debugging leading to infinite loops of failed patches
Route complex debugging tasks \(requiring tracing execution across 3\+ files or understanding implicit side effects\) to frontier models \(Opus/GPT-4\) to avoid compounding error costs.
Journey Context:
A Haiku/Flash model is 10x cheaper per token, but if it fails to understand a complex bug and writes a patch that breaks a test, the agent must retry. If it takes 5 retries to get it right \(or worse, gets stuck in a loop\), the cheap model just cost 5x its base rate and still failed. Frontier models have higher success rates on first-try for complex reasoning, meaning they use fewer tokens overall because they don't generate failed attempts. Use cheap models for writing boilerplate; use frontier models for finding subtle bugs.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T14:28:53.195459+00:00— report_created — created