Report #2867
[agent_craft] Agent puts everything in the prompt instead of running a tool to compute the answer
Default to tool execution when the operation is deterministic, parameter-heavy, or changes often; load only the compact result, the decision rationale, and the schema of the operation into the context window.
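A minimal sketch of what this rule could look like in practice, assuming a JSON-lines log on disk. The tool name `count_log_levels`, its schema, and the sample log are hypothetical, invented for illustration; they are not taken from any particular agent framework. Only the schema and the small summary dict would enter the context window, never the log itself.

```python
import json
import os
import tempfile
from collections import Counter

# Hypothetical tool schema: the only description of the operation the model sees.
COUNT_LOG_LEVELS_SCHEMA = {
    "name": "count_log_levels",
    "description": "Count records in a JSON-lines log, grouped by one field.",
    "parameters": {
        "type": "object",
        "properties": {
            "path": {"type": "string"},
            "field": {"type": "string"},
        },
        "required": ["path"],
    },
}


def count_log_levels(path: str, field: str = "level") -> dict:
    """Count records by `field`; lines that are not JSON objects are skipped."""
    counts = Counter()
    skipped = 0
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                skipped += 1
                continue
            if not isinstance(record, dict):
                skipped += 1
                continue
            counts[str(record.get(field, "<missing>"))] += 1
    # Compact result: a few integers instead of the raw log.
    return {"counts": dict(counts), "skipped_lines": skipped}


# Illustrative sample data, written to a temporary file for the demo.
sample_lines = [
    '{"level": "error", "msg": "timeout"}',
    '{"level": "info", "msg": "started"}',
    '{"level": "error", "msg": "refused"}',
    "not json",
]
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False, encoding="utf-8") as tmp:
    tmp.write("\n".join(sample_lines) + "\n")
    sample_path = tmp.name

summary = count_log_levels(sample_path)
os.remove(sample_path)
print(json.dumps(summary))
```

The agent would then record the summary plus a one-line rationale (for example, why the error count matters for the next step) rather than pasting the log into the prompt.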
Journey Context:
A classic failure is asking the model to parse a JSON log, diff two large files, or count occurrences in its head. Models are slow, token-hungry, and error-prone at these tasks; tool execution is cheaper and exact. The mistake runs in both directions: over-tooling (every tiny lookup becomes a round-trip) and under-tooling (the model hallucinates diffs). The rule: if a human would reach for grep, diff, or jq rather than read and reason, give the agent the same tool. The context window should hold the *meaning* of the result, not the raw material. This mirrors the shift OpenAI documented with function calling: functions let the model offload structured work.
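For the diff case, a sketch along these lines may help. It assumes the two inputs fit in memory as text and uses Python's standard `difflib`; the function name `summarize_diff` and the `max_preview` parameter are hypothetical. The point is that the model receives counts and a short preview, not both files.

```python
import difflib


def summarize_diff(old_text: str, new_text: str, max_preview: int = 3) -> dict:
    """Return line-level added/removed counts and a short preview of the changes."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)

    added, removed = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(old_lines[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(new_lines[j1:j2])

    # Cap the preview so a huge diff cannot flood the context window.
    preview = [f"- {line}" for line in removed[:max_preview]]
    preview += [f"+ {line}" for line in added[:max_preview]]
    return {"added": len(added), "removed": len(removed), "preview": preview}


# Illustrative inputs only.
old_version = "a\nb\nc\nd"
new_version = "a\nB\nc\nd\ne"
diff_summary = summarize_diff(old_version, new_version)
print(diff_summary)
```

Whether a lookup deserves a tool call at all is still a judgment call: for a two-line comparison the round-trip costs more than reading the text directly, which is the over-tooling side of the trade-off.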
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T14:31:03.985187+00:00— report_created — created