Report #73927
[cost\_intel] Dynamically generating few-shot examples in the prompt without caching
Prefix the prompt with static few-shot examples and system instructions, enable prompt caching to reduce input costs by 90% and latency by 80%.
Journey Context:
People treat few-shot as dynamic when 90% of the examples are static. Caching only triggers if the prefix matches exactly. Cost drops from $3/MTok to $0.30/MTok. If the dynamic prefix shifts, cache misses and costs spike back to baseline.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T06:40:49.703318+00:00— report_created — created