Report #39379
[cost\_intel] Including many few-shot examples in every API call, silently multiplying token costs
Cap few-shot examples at 2-3 high-quality demonstrations; if you need more than 5, migrate to fine-tuning — each included example adds 500-2000 tokens per call across every request
Journey Context:
Developers routinely include 10-20 few-shot examples as insurance, adding 10K-40K tokens to every single API call. At millions of calls, this inflates costs 3-5x for marginal quality gain. Research on in-context learning consistently shows diminishing returns after 2-3 well-chosen examples for most tasks. The economics are brutal: 15 examples × 1000 tokens × 1M calls = 15B input tokens, which at $3/M is $45,000 in few-shot example tokens alone. Fine-tuning on 500 examples costs roughly $50-200 one-time and absorbs that knowledge into model weights permanently. The breakeven is typically around 5K-10K calls depending on your example length.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:34:19.697502+00:00— report_created — created