Report #73928
[cost_intel] Injecting entire OpenAPI JSON specs into the LLM context for tool use
Prune OpenAPI specs to only required endpoints and trimmed descriptions, or use dynamic tool selection via embedding search to limit the schema payload per request.
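A minimal sketch of the pruning step, assuming the spec is already loaded as a Python dict and the caller knows which (path, method) pairs the model needs. The names `prune_openapi`, `keep`, and `max_desc_chars` are hypothetical, not part of any library. It does not drop unreferenced entries under `components`, which would be a further saving.

```python
import copy

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def _trim(node, limit):
    """Recursively shorten every string-valued 'description' field in place."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "description" and isinstance(value, str) and len(value) > limit:
                node[key] = value[:limit].rstrip() + "..."
            else:
                _trim(value, limit)
    elif isinstance(node, list):
        for child in node:
            _trim(child, limit)


def prune_openapi(spec, keep, max_desc_chars=120):
    """Return a pruned copy of `spec`.

    `keep` is a set of (path, lowercase_method) pairs, e.g. {("/pets", "get")}.
    Operations not listed are dropped; kept descriptions are truncated.
    The input dict is not modified.
    """
    pruned = {k: copy.deepcopy(v) for k, v in spec.items() if k != "paths"}
    pruned["paths"] = {}
    for path, item in spec.get("paths", {}).items():
        kept_ops = {
            method: op
            for method, op in item.items()
            if method.lower() in HTTP_METHODS and (path, method.lower()) in keep
        }
        if not kept_ops:
            continue  # no required operation on this path
        # Preserve path-level fields (e.g. shared parameters) alongside kept operations.
        shared = {k: v for k, v in item.items() if k.lower() not in HTTP_METHODS}
        new_item = copy.deepcopy({**shared, **kept_ops})
        _trim(new_item, max_desc_chars)
        pruned["paths"][path] = new_item
    return pruned
```

Calling `prune_openapi(spec, {("/pets", "get")})` would keep only the `GET /pets` operation. Comparing the token count of the serialized result against the original is a quick way to check the actual saving for a given spec.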
Journey Context:
A typical Swagger (OpenAPI) file can run 20k-50k tokens. At $10/MTok input, that is $0.20-$0.50 per request for the schema alone. Models often need only 1-2 tools per request. Pruning can cut the token count by 10x-50x, usually without quality loss, though complex multi-step reasoning across many tools might suffer if the model can't see all available tools.
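The cost arithmetic above can be reproduced with a few lines. This is a sketch using the report's own figures ($10/MTok input, 20k-50k schema tokens, 10x-50x reduction); `schema_cost_usd` is a hypothetical helper name, and real prices vary by model and provider.

```python
def schema_cost_usd(schema_tokens, usd_per_mtok=10.0):
    """Input cost in USD of sending `schema_tokens` tokens of schema per request."""
    return schema_tokens * usd_per_mtok / 1_000_000


if __name__ == "__main__":
    for tokens in (20_000, 50_000):
        full = schema_cost_usd(tokens)
        # Apply the 10x and 50x reduction factors quoted in the report.
        for factor in (10, 50):
            pruned = schema_cost_usd(tokens / factor)
            print(f"{tokens} tokens: ${full:.2f} full, ${pruned:.3f} after {factor}x pruning")
```

At these figures a 50k-token schema costs $0.50 per request unpruned and roughly $0.01-$0.05 after pruning, so the saving compounds quickly at high request volume.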
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T06:41:08.519228+00:00— report_created — created