Report #26806
[counterintuitive] Should I include few-shot examples in prompts to improve coding accuracy?
Default to zero-shot with precise specifications. Reserve few-shot for three scenarios only: (1) teaching a non-standard output format the model hasn't seen, (2) demonstrating edge-case handling that specs alone can't convey, (3) establishing a pattern that differs from the model's default behavior. Use 2-3 examples at most, make sure they are diverse rather than repetitive, and audit them for bugs before inclusion.
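A minimal sketch of that default, assuming prompts are assembled as plain strings. The names `Example`, `build_prompt`, and `MAX_EXAMPLES` are hypothetical and not part of any library; the layout of the prompt text is likewise only an illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Example:
    """One few-shot demonstration (hypothetical structure for this sketch)."""
    task: str
    output: str


MAX_EXAMPLES = 3  # upper bound taken from the "2-3 examples" guidance above


def build_prompt(spec: str, task: str,
                 examples: Optional[Sequence[Example]] = None) -> str:
    """Return a zero-shot prompt unless examples are explicitly supplied."""
    chosen = list(examples or [])
    if len(chosen) > MAX_EXAMPLES:
        raise ValueError(
            f"at most {MAX_EXAMPLES} examples allowed, got {len(chosen)}"
        )
    # Reject verbatim repeats: they add anchoring without adding diversity.
    if len(set(chosen)) != len(chosen):
        raise ValueError("examples must be distinct")

    parts = [f"Specification:\n{spec}"]
    for i, ex in enumerate(chosen, start=1):
        parts.append(f"Example {i}\nTask: {ex.task}\nOutput:\n{ex.output}")
    parts.append(f"Task: {task}\nOutput:")
    return "\n\n".join(parts)


# Zero-shot is the default path: no examples are passed.
zero_shot = build_prompt("Return ISO 8601 dates only.", "Parse '3 Jan 2024'")

# Few-shot is opt-in, here to pin down a non-standard output format.
few_shot = build_prompt(
    "Return dates in the house format.",
    "Parse '3 Jan 2024'",
    examples=[Example("Parse '1 Feb 2023'", "2023|02|01")],
)
```

Making the examples an optional argument keeps few-shot a deliberate choice at each call site. The distinctness check only catches exact duplicates; judging real diversity still needs a human read.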
Journey Context:
Few-shot prompting was critical when models needed demonstration to understand task format (GPT-3 era). With frontier models, zero-shot with clear instructions matches or exceeds few-shot on most coding tasks. Few-shot carries hidden costs: it consumes context window that could hold actual code context, it anchors the model to patterns in the examples, including their bugs and style quirks, and it creates a maintenance burden as models update. Examples that helped GPT-3.5 can hurt GPT-4+ by overriding better learned behaviors. The critical exception: when you need the model to replicate a specific output format or handle unusual edge cases that are hard to describe declaratively, one well-chosen example beats paragraphs of description.
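Because the model can copy bugs from its examples, it may help to run each example through a check before it goes into a prompt. The following is a rough sketch, assuming the examples are Python snippets from a trusted source (it uses `exec`, so it is not safe for untrusted code). `audit_example` and the `clamp` snippets are hypothetical illustrations, not an established tool.

```python
from typing import Callable, Dict, Tuple

Namespace = Dict[str, object]


def audit_example(code: str,
                  check: Callable[[Namespace], bool]) -> Tuple[bool, str]:
    """Run one example in a fresh namespace, then apply a caller-supplied check.

    Only for trusted snippets: exec() runs the code with full privileges.
    """
    namespace: Namespace = {}
    try:
        exec(compile(code, "<few-shot example>", "exec"), namespace)
    except Exception as err:  # syntax and runtime errors both fail the audit
        return False, f"{type(err).__name__}: {err}"
    try:
        passed = bool(check(namespace))
    except Exception as err:  # the check itself may raise, e.g. a missing name
        return False, f"{type(err).__name__}: {err}"
    return passed, ("ok" if passed else "check failed")


# Two candidate examples; the first has a copy-paste bug (lo used twice).
buggy = "def clamp(x, lo, hi):\n    return max(lo, min(x, lo))\n"
good = "def clamp(x, lo, hi):\n    return max(lo, min(x, hi))\n"


def clamp_check(ns: Namespace) -> bool:
    """Behavioral check the example must satisfy before inclusion."""
    clamp = ns["clamp"]
    return clamp(5, 0, 10) == 5 and clamp(15, 0, 10) == 10


buggy_result = audit_example(buggy, clamp_check)
good_result = audit_example(good, clamp_check)
```

Here the buggy snippet is syntactically valid, so only the behavioral check catches it. A check like this covers correctness only; style quirks that the model might imitate still need manual review.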
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T23:23:31.579740+00:00— report_created — created