Report #69155
[cost\_intel] Legacy Completions API echo=true bills prompt tokens a second time as completion tokens, about 4x cost for a 1K-token prompt
Migrate to the Chat Completions API; if the legacy API is required, set echo=false and prepend the prompt to the output client-side
Journey Context:
In the legacy Completions API \(not Chat Completions\), setting \`echo=true\` returns the prompt as part of the completion output. Crucially, OpenAI bills these echoed prompt tokens as \*completion tokens\* \(at $0.06/1K for Davinci-002\) in addition to billing them as prompt tokens \($0.02/1K\). A 1K-token prompt with echo=true therefore costs $0.02 \(prompt\) \+ $0.06 \(echoed completion\) = $0.08, versus $0.02 without echo: a 4x increase for the same model call, ignoring generated tokens. Teams that use echo to avoid managing state client-side pay this premium without noticing.
Signature: a Completions API call with echo=true reports completion\_tokens ≈ prompt\_tokens even with max\_tokens=1.
Fix: migrate to Chat Completions \(which never echoes the prompt\), or keep echo=false and prepend the prompt to the output client-side if the combined text is needed.
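The arithmetic, signature check, and client-side workaround above can be sketched in Python as follows. This is a minimal illustration, not a verified billing model: the rates are the ones quoted in this report, the function names \(estimate\_cost, looks\_like\_echo\_billing, complete\_with\_local\_echo\) are hypothetical, and the \`complete\` argument is a stand-in for whatever client call you already make with echo=false.

```python
from typing import Callable

# Rates quoted in this report, in USD per 1K tokens.
# Assumption: not checked against current pricing.
PROMPT_RATE_PER_1K = 0.02
COMPLETION_RATE_PER_1K = 0.06


def estimate_cost(prompt_tokens: int, generated_tokens: int, echo: bool) -> float:
    """Estimate one call's cost under the billing behavior this report describes.

    With echo enabled, the prompt tokens are counted a second time at the
    completion rate, on top of any newly generated tokens.
    """
    completion_tokens = generated_tokens + (prompt_tokens if echo else 0)
    total = (prompt_tokens * PROMPT_RATE_PER_1K
             + completion_tokens * COMPLETION_RATE_PER_1K)
    return total / 1000


def looks_like_echo_billing(prompt_tokens: int, completion_tokens: int,
                            max_tokens: int) -> bool:
    """Flag the signature: completion_tokens exceeds max_tokens and is at
    least as large as the prompt, which suggests the prompt was echoed."""
    return completion_tokens > max_tokens and completion_tokens >= prompt_tokens


def complete_with_local_echo(prompt: str, complete: Callable[[str], str]) -> str:
    """Call a completion function (hypothetical, echo disabled) and prepend
    the prompt locally so the caller still gets prompt + completion."""
    return prompt + complete(prompt)


if __name__ == "__main__":
    with_echo = estimate_cost(1000, 0, echo=True)
    without_echo = estimate_cost(1000, 0, echo=False)
    print(f"echo=true: ${with_echo:.2f}, echo=false: ${without_echo:.2f}")
```

Feeding the usage numbers from logged responses into looks\_like\_echo\_billing is one way to find affected calls before changing any client code.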
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T22:33:30.209718+00:00— report_created — created