Report #29393
[frontier] Static few-shot prompts causing poor performance on diverse tasks
Implement vector-based retrieval of few-shot examples from a curated memory bank, dynamically selecting relevant demonstrations based on query similarity rather than using fixed examples.
Journey Context:
Hardcoded few-shot examples cannot cover the long tail of user queries and agent tasks, so the model generalizes poorly to inputs that differ from the fixed demonstrations. Selecting examples at random does not solve this, because the quality and relevance of the demonstrations then vary from call to call. Dynamic few-shot retrieval instead embeds the current task or query and retrieves the most semantically similar successful past examples from a vector store. The retrieved demonstrations match the patterns and edge cases of the current input, which can improve agent performance on specialized tasks without hand-tuning a separate prompt for each case.
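The retrieval step described above can be sketched as follows. This is a minimal illustration, not a reference implementation: `Example`, `MemoryBank`, `embed`, and `build_prompt` are hypothetical names, the in-memory list stands in for a real vector store, and `embed` is a toy bag-of-words vector used only so the sketch runs without dependencies. A real setup would presumably call an embedding model and a vector database instead.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Example:
    """A curated, previously successful demonstration (hypothetical schema)."""
    query: str
    answer: str


def embed(text: str) -> Counter:
    # ASSUMPTION: toy stand-in for a real embedding model.
    # Produces a sparse bag-of-words count vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MemoryBank:
    """In-memory stand-in for a vector store of curated examples."""

    def __init__(self, examples):
        # Embed each example once, at indexing time.
        self._items = [(ex, embed(ex.query)) for ex in examples]

    def retrieve(self, query: str, k: int = 2):
        # Rank stored examples by similarity to the query, keep the top k.
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [ex for ex, _ in ranked[:k]]


def build_prompt(bank: MemoryBank, query: str, k: int = 2) -> str:
    # Place the retrieved demonstrations before the new query.
    blocks = [f"Q: {ex.query}\nA: {ex.answer}" for ex in bank.retrieve(query, k)]
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)
```

Calling `build_prompt(bank, "How do I parse a date string in Python?")` on a bank that mixes date-handling and SQL examples would place the date-handling demonstrations in the prompt and leave the SQL one out. The value of `k` and the similarity measure are design choices to tune for each task.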
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:43:43.933177+00:00 - report_created - created