Report #99502
[cost\_intel] Tool schemas are replayed in full on every request, often costing more tokens than the tools save
Move large reference data out of tool schemas and into a cached system-prompt block. Keep each tool schema tiny: have it accept an ID that maps to a lookup table instead of embedding the data inline.
Journey Context:
It is tempting to include product catalogs, API specs, or database schemas inside tool definitions so the model picks the right tool. But OpenAI and Anthropic include the full JSON schema of every tool in the context of every request, whether or not a tool is called. At roughly 500 tokens per tool schema, 20 tools add about 10k tokens to each call. The cheaper pattern is to keep the schema small \(e.g., product\_id: string\) and either put the lookup table in a cached context block or fetch the data after tool selection.
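A minimal sketch of the pattern described above, in Python. The catalog contents, the tool name `get_product`, and the helper functions are hypothetical and exist only for illustration. The `input_schema` and `cache_control` fields are assumed to follow the shape of Anthropic's Messages API; check the current provider documentation before relying on them.

```python
import json

# Hypothetical catalog standing in for large reference data.
CATALOG = {
    "p-001": {"name": "Widget", "price_usd": 9.99},
    "p-002": {"name": "Gadget", "price_usd": 24.50},
}


def schema_overhead_tokens(tokens_per_tool: int, tool_count: int) -> int:
    """Tokens added to every request by the tool definitions alone."""
    return tokens_per_tool * tool_count


# Tiny tool schema: it accepts an ID and embeds no reference data.
GET_PRODUCT_TOOL = {
    "name": "get_product",
    "description": "Look up one product by ID. Valid IDs are listed in the system prompt.",
    "input_schema": {
        "type": "object",
        "properties": {"product_id": {"type": "string"}},
        "required": ["product_id"],
    },
}


def build_system_blocks(catalog: dict) -> list:
    """Put the ID -> name table in one system block marked for caching (assumed field shape)."""
    lines = [f"{pid}: {item['name']}" for pid, item in sorted(catalog.items())]
    return [
        {
            "type": "text",
            "text": "Product IDs:\n" + "\n".join(lines),
            "cache_control": {"type": "ephemeral"},
        }
    ]


def handle_tool_call(name: str, arguments: dict) -> dict:
    """Resolve the ID on the application side after the model has picked the tool."""
    if name != "get_product":
        return {"error": f"unknown tool: {name}"}
    item = CATALOG.get(arguments.get("product_id", ""))
    if item is None:
        return {"error": "unknown product_id"}
    return item


if __name__ == "__main__":
    # 500 tokens per tool times 20 tools, the figure used in this report.
    print(schema_overhead_tokens(500, 20))
    print(json.dumps(handle_tool_call("get_product", {"product_id": "p-002"})))
```

The full product record never appears in the tool definition, so growing the catalog does not grow the per-request schema cost. Only the system block grows, and that block is the part marked for caching.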
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T05:14:35.012231+00:00— report_created — created