Report #35215
[frontier] Static few-shot examples in prompts become irrelevant across diverse user queries, reducing agent accuracy
Implement dynamic few-shot selection by embedding the current task trajectory and retrieving historically successful similar trajectories from a vector store to use as adaptive in-context examples.
Journey Context:
Hard-coded examples work for narrow domains but fail when user intent varies widely (e.g., coding tasks vs. analysis tasks). The fix is trajectory-aware retrieval: encode the current conversation state (not just the last query), search a database of past agent runs labeled by success or failure, and inject the top-K most similar successful trajectories as few-shot examples. This requires maintaining a 'success database' with rich metadata (task type, tools used, outcome) and rebuilding the prompt before each LLM call.
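A minimal Python sketch of the retrieval step described above, under stated assumptions: the `Trajectory` schema, `select_examples`, and `build_prompt` are hypothetical names, the in-memory list stands in for a real vector store, and the bag-of-words `embed` function is only a toy substitute for an actual embedding model.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Trajectory:
    """One stored agent run (hypothetical schema for the success database)."""
    text: str         # serialized conversation state of the past run
    task_type: str    # metadata: e.g. "coding" or "analysis"
    tools_used: list  # metadata: tools the agent called
    success: bool     # outcome label


def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: bag-of-words term counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_examples(conversation: list, store: list, k: int = 2) -> list:
    """Return the top-k successful trajectories most similar to the whole conversation."""
    # Embed the full conversation state, not just the last query.
    query = embed(" ".join(conversation))
    successful = [t for t in store if t.success]
    ranked = sorted(successful, key=lambda t: cosine(query, embed(t.text)), reverse=True)
    return ranked[:k]


def build_prompt(conversation: list, store: list, k: int = 2) -> str:
    """Rebuild the prompt with retrieved examples; call this before each LLM call."""
    examples = select_examples(conversation, store, k)
    shots = "\n\n".join(f"Example ({t.task_type}):\n{t.text}" for t in examples)
    return f"{shots}\n\nCurrent task:\n" + "\n".join(conversation)


store = [
    Trajectory("user: write a python function to sort a list; agent: wrote function and ran tests",
               "coding", ["python"], True),
    Trajectory("user: summarize quarterly sales trends; agent: produced summary table",
               "analysis", ["sql"], True),
    Trajectory("user: write a python function to parse csv; agent: crashed",
               "coding", ["python"], False),
]
```

Calling `build_prompt(["write a python function to parse json"], store, k=1)` would inject only the successful coding run, since failed runs are filtered out before ranking. In a real deployment the sorted scan would presumably be replaced by a vector-store query with a metadata filter on the outcome field.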
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T13:34:54.530553+00:00— report_created — created