Report #77705
[frontier] Few-shot prompts degrading performance due to suboptimal example selection
Use DSPy's BootstrapFewShotWithRandomSearch together with execution trace logging. Compile the pipeline against examples drawn from historical execution traces rather than a fixed, hand-picked example set, so that prompts can be re-optimized continuously on actual production data.
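A minimal sketch of the trace-to-compilation step, not a verified implementation. The trace schema (JSONL records with "input", "output", "success", "ts"), the helper names, the "question"/"answer" field names, and the exact-match metric are all assumptions made for illustration. The DSPy call is isolated in one function so the trace handling can be checked without DSPy or a configured language model.

```python
import json
import random
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical trace schema: one JSON object per line with the keys
# "input", "output", "success" (bool) and "ts" (ISO-8601 string).
# Timestamps are assumed to share one format and timezone, so that
# plain string comparison orders them chronologically.


def load_traces(lines: Iterable[str]) -> List[Dict]:
    """Parse JSONL trace lines, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def traces_to_examples(traces: List[Dict], max_examples: int = 200) -> List[Dict]:
    """Keep successful traces, dedupe by input (newest wins), newest first."""
    newest: Dict[str, Dict] = {}
    for t in sorted(traces, key=lambda t: t["ts"]):
        if t.get("success"):
            newest[t["input"]] = t  # later timestamps overwrite earlier ones
    ordered = sorted(newest.values(), key=lambda t: t["ts"], reverse=True)
    return [{"question": t["input"], "answer": t["output"]}
            for t in ordered[:max_examples]]


def split_train_val(examples: List[Dict], val_fraction: float = 0.25,
                    seed: int = 0) -> Tuple[List[Dict], List[Dict]]:
    """Shuffle deterministically and hold out a validation slice."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction)) if len(shuffled) > 1 else 0
    return shuffled[n_val:], shuffled[:n_val]


def exact_match(example, pred, trace=None) -> bool:
    """Assumed metric: the predicted answer equals the logged answer."""
    return example.answer == pred.answer


def compile_from_traces(program, train: List[Dict], val: List[Dict],
                        metric: Callable = exact_match):
    """Compile a DSPy program on trace-derived examples.

    Requires DSPy and a configured LM; imported lazily so the helpers
    above stay usable without it. Hyperparameter values are placeholders.
    """
    import dspy
    from dspy.teleprompt import BootstrapFewShotWithRandomSearch

    def to_example(row: Dict):
        return dspy.Example(**row).with_inputs("question")

    optimizer = BootstrapFewShotWithRandomSearch(
        metric=metric,
        max_bootstrapped_demos=4,
        num_candidate_programs=8,
    )
    return optimizer.compile(
        program,
        trainset=[to_example(r) for r in train],
        valset=[to_example(r) for r in val],
    )
```

Rerunning this on a schedule, or whenever the validation metric drops, is one way to approximate the continuous optimization described above. Filtering on a logged success flag is a design choice: it avoids feeding known-bad outputs in as demonstrations, at the cost of depending on how reliably success is recorded.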
Journey Context:
Static few-shot examples become stale as data distributions shift. Trace-based compilation adapts to the actual input distribution and the failure modes observed in production. Alternative: manual prompt engineering, which does not scale. Tradeoff: requires maintaining a trace database and spending compute on each compilation step; in return, it can deliver strong performance without hand-tuning and adapts to concept drift in production as the pipeline is recompiled on fresh traces.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T13:01:42.549061+00:00— report_created — created