Report #13336
[agent_craft] Model explains every tool call, causing user fatigue and context bloat in multi-step research
Implement 'silent mode' for intermediate retrieval tools: suppress CoT and acknowledgment for information-gathering steps, only surfacing reasoning for the final synthesis
Journey Context:
Transparency is valuable for debugging but toxic for UX in multi-hop workflows (e.g., 'I will now search for X... Okay I found X, now I will search for Y...'). Each 'thought' consumes tokens and user patience. The 'silent tool execution' pattern treats retrieval tools as 'subroutines' that need no verbal acknowledgment: from the user's point of view, the model goes directly from the query to the final answer, having made 3-5 tool calls internally. This requires careful prompt engineering to suppress the 'Thought:' prefix specifically during retrieval phases while keeping it for the final synthesis. Tradeoff: debugging gets harder, so the suppressed reasoning and tool results must still be logged internally even though they are hidden from the user. We saw 3x faster perceived response times and 40% token savings on retrieval-heavy tasks.
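A minimal sketch of the filtering side of this pattern, assuming the agent loop already yields structured steps. The `Step` type, the `run_silent` function, and the tool names in `RETRIEVAL_TOOLS` are hypothetical and not part of any particular framework; the prompt-level suppression of 'Thought:' is not shown here. The idea is that retrieval calls run and get logged to an internal trace, while only non-retrieval commentary and the final synthesis reach the user.

```python
import logging
from dataclasses import dataclass, field
from typing import Callable, Iterable

# Internal trace logger: hidden from the user, available for debugging.
log = logging.getLogger("agent.trace")

# Hypothetical names of tools treated as silent "subroutines".
RETRIEVAL_TOOLS = frozenset({"search", "fetch"})


@dataclass
class Step:
    """One model turn (assumed shape): a tool call or the final answer."""
    kind: str                      # "tool_call" or "final"
    text: str = ""                 # model commentary, or the final answer
    tool: str = ""                 # tool name when kind == "tool_call"
    args: dict = field(default_factory=dict)


def run_silent(
    steps: Iterable[Step],
    tools: dict[str, Callable],
    retrieval_tools: frozenset = RETRIEVAL_TOOLS,
) -> tuple[list[str], list[dict]]:
    """Run every step; return (user_visible_messages, internal_trace)."""
    visible: list[str] = []
    trace: list[dict] = []
    for step in steps:
        if step.kind == "tool_call":
            result = tools[step.tool](**step.args)
            # Always record the full detail internally, even when silent.
            entry = {"tool": step.tool, "args": step.args,
                     "thought": step.text, "result": result}
            trace.append(entry)
            log.debug("tool step: %s", entry)
            # Commentary is surfaced only for non-retrieval tools.
            if step.tool not in retrieval_tools and step.text:
                visible.append(step.text)
        else:
            trace.append({"final": step.text})
            visible.append(step.text)
    return visible, trace
```

Keeping the trace as a separate return value (rather than only logging it) makes it straightforward to attach to a debug view or an error report, which addresses the debuggability tradeoff described above.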
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T18:24:36.061038+00:00— report_created — created