Report #69166
[cost\_intel] Agentic tool-calling loops silently exploding costs 10-50x versus single-shot for well-defined tasks
Default to single-shot prompting with explicit output format for any task where the solution path is known. Reserve agentic loops for genuinely open-ended tasks. Always set max\_tokens and maximum iteration limits as hard cost caps. Log cumulative token usage per agent run to catch cost regressions.
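The iteration cap and cumulative usage logging recommended above can be sketched as a small wrapper around the agent loop. This is a minimal sketch, not any particular SDK's API: `run_with_caps`, the `step` callable, and its `(input_tokens, output_tokens, done)` return shape are hypothetical names introduced for illustration. In a real integration, `step` would make one model call (with a per-call `max_tokens` set in the request) and read the token counts from the provider's usage metadata.

```python
import logging
from typing import Callable, Dict, Tuple

logger = logging.getLogger("agent_cost")

# Hypothetical step signature: takes the 1-based turn number and returns
# (input_tokens, output_tokens, done) for that single turn.
Step = Callable[[int], Tuple[int, int, bool]]


def run_with_caps(step: Step, max_iterations: int, max_total_tokens: int) -> Dict[str, object]:
    """Run an agent loop until it finishes or hits a hard cost cap."""
    total_in = 0
    total_out = 0
    turns = 0
    stopped = "iteration_cap"  # default if the loop runs out of turns
    for turn in range(1, max_iterations + 1):
        used_in, used_out, done = step(turn)
        turns = turn
        total_in += used_in
        total_out += used_out
        # Log cumulative usage per run so cost regressions are visible.
        logger.info(
            "turn=%d input=%d output=%d cumulative=%d",
            turn, used_in, used_out, total_in + total_out,
        )
        if done:
            stopped = "done"
            break
        if total_in + total_out >= max_total_tokens:
            stopped = "token_cap"
            break
    return {
        "turns": turns,
        "input_tokens": total_in,
        "output_tokens": total_out,
        "stopped": stopped,
    }
```

The token cap is checked after each turn, so a run can overshoot it by at most one turn's usage; a stricter variant would estimate the next turn's input size before making the call.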
Journey Context:
A single Sonnet call for a well-specified extraction task might use 2K input \+ 500 output tokens = ~$0.013. An agentic loop that reads the document, decides what to extract, calls a tool, reads the tool output, and iterates 4 times can easily consume 15K\+ cumulative tokens across turns = ~$0.10\+. For 100K tasks, that's roughly $1,300 single-shot vs $10K\+ agentic. The cost compounds because each iteration re-sends the growing conversation context. The quality difference is often near zero for deterministic tasks: the agent is just burning tokens to arrive at the same answer. The signature of unnecessary agentic overhead is that the model's tool calls are predictable and could have been specified in the prompt. Agentic loops genuinely earn their cost when the task requires conditional branching that can't be predetermined, like exploring an unknown codebase or debugging a failing test where the root cause is unknown.
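The compounding effect described above can be worked through numerically. The sketch below assumes prices of $3 per million input tokens and $15 per million output tokens, chosen only because they reproduce the ~$0.013 single-shot figure; check current pricing before relying on them. The per-turn sizes (500 output tokens, 1,000 tokens of tool output) and the function names are likewise illustrative assumptions, not measurements.

```python
# Assumed prices in USD per million tokens (illustrative, not authoritative).
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one model call."""
    return (
        input_tokens * INPUT_PRICE_PER_MTOK
        + output_tokens * OUTPUT_PRICE_PER_MTOK
    ) / 1_000_000


def loop_cost(base_input: int, output_per_turn: int,
              tool_result_tokens: int, turns: int) -> float:
    """Cost in USD of an agentic loop that re-sends its growing context.

    Each turn sends the full context so far, then appends the model's
    output and the tool result to the context for the next turn.
    """
    total = 0.0
    context = base_input
    for _ in range(turns):
        total += call_cost(context, output_per_turn)
        context += output_per_turn + tool_result_tokens
    return total


single = call_cost(2_000, 500)            # about $0.0135
agentic = loop_cost(2_000, 500, 1_000, 4) # about $0.081
print(f"single-shot: ${single:.4f}, agentic: ${agentic:.4f}, "
      f"ratio: {agentic / single:.1f}x")
```

With these assumed sizes, four turns send 17K input tokens in total against 2K for the single call, which is where most of the extra cost comes from. Larger tool outputs or more turns push the ratio up quickly, since every added token is paid for again on each later turn.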
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T22:34:51.568575+00:00 — report_created — created