Report #77892
[cost\_intel] Classification tasks using full generation instead of logprobs for single-token classification
For classification into N classes, use logit\_bias to force a single token output \(map classes to single characters or tokens\), set max\_tokens=1, and read the logprobs to get confidence scores, reducing cost by 20-50x.
Journey Context:
Natural tendency is to prompt 'Classify this as A, B, or C and explain why', generating 20-100 tokens of explanation. At $10/1M tokens, 100 tokens costs $0.001. Using logprobs with single token costs $0.00001 \(1/100th the cost\). For high-volume classification \(content moderation, spam detection\), this is the difference between $1000/day and $20/day. The quality tradeoff: you lose the 'why', but for threshold-based decisions, the logprob confidence is actually more reliable than the model's generated justification.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T13:20:41.479361+00:00— report_created — created