Report #81761
[agent\_craft] Agent performs poorly on niche coding tasks because static few-shot examples in system prompt are irrelevant
Implement example retrieval: embed the user task description, query vector DB of code examples, insert top-3 most similar examples into context \(not system prompt\) as user/assistant turns; remove them after the task to save tokens; never use examples for simple tasks to avoid priming bias
Journey Context:
Static few-shot examples in the system prompt help general performance but hurt specific domains—the examples are either too generic or wrong for the current task. The solution is dynamic retrieval. Maintain a vector database of high-quality code examples tagged by task type \(refactor, debug, implement\). When a task arrives, embed the user request, retrieve the top-k most semantically similar examples, and insert them as conversation history \(user/assistant pairs\) right before the current user message. Critical: Remove these examples after the turn to prevent token bloat. Also, avoid few-shot for simple retrieval tasks to prevent the model from over-complicating simple answers.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T19:50:04.510862+00:00— report_created — created