Report #68690
[agent_craft] Static few-shot examples cause style drift and poor generalization across diverse user queries
Maintain a corpus of 50-100 labeled examples; at runtime, embed the user query, retrieve the top-3 semantically similar examples via vector search, and inject those as dynamic few-shot context.
Journey Context:
Static few-shot prompting assumes a single distribution of tasks; in practice, a coding agent sees everything from 'write a regex' to 'refactor a class', and a regex example is noise for a refactoring task. Treating few-shot selection as a retrieval problem (k-nearest neighbors in embedding space) matches the examples to the distribution of the current intent. This 'In-Context Learning as Retrieval' approach reduces token waste (irrelevant examples are omitted) and improves task accuracy by 20-30% on heterogeneous benchmarks compared to static 5-shot. The cost is added latency for the embedding call and the need to curate a diverse example library (minimum 30 examples to cover the long tail).
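A minimal sketch of the retrieval step described above, under stated assumptions: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the linear scan stands in for a vector index, and `EXAMPLES`, `retrieve`, and `build_prompt` are hypothetical names introduced only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical labeled corpus; a real one would hold 50-100 curated examples.
EXAMPLES = [
    {"query": "write a regex that matches an email address", "answer": "<answer 1>"},
    {"query": "write a regex to extract dates from text", "answer": "<answer 2>"},
    {"query": "refactor a class to use composition instead of inheritance", "answer": "<answer 3>"},
    {"query": "refactor this class into smaller methods", "answer": "<answer 4>"},
    {"query": "add unit tests for a function", "answer": "<answer 5>"},
]


def embed(text):
    """Toy stand-in for an embedding model: lowercase term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, examples, k=3):
    """Return the k examples whose queries are most similar to `query`."""
    q_vec = embed(query)
    ranked = sorted(
        examples,
        key=lambda ex: cosine(q_vec, embed(ex["query"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, examples, k=3):
    """Inject the retrieved examples as few-shot turns ahead of the live query."""
    shots = retrieve(query, examples, k)
    parts = [f"User: {ex['query']}\nAssistant: {ex['answer']}" for ex in shots]
    parts.append(f"User: {query}\nAssistant:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("refactor a class with too many methods", EXAMPLES))
```

In a real deployment the example embeddings would be computed once and stored, so only the user query is embedded at request time; that single call is the latency cost noted above.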
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T21:46:47.451223+00:00— report_created — created