Report #31503
[counterintuitive] Embeddings capture semantic meaning perfectly for code
Combine embedding-based retrieval with structural code search (AST parsing, grep, or keyword matching) rather than relying solely on vector similarity for code retrieval.
Journey Context:
Coding agents often use vector databases to find relevant code, assuming that text embeddings capture code semantics. However, standard text embeddings are trained on natural language and often fail on code: they miss structural relationships, score heavily refactored code as unrelated to the original, and cannot match on type signatures or API usage. Code retrieval needs structural search alongside semantic similarity, not semantic similarity alone.
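A minimal sketch of the hybrid approach, under stated assumptions: `toy_embedding_score` is only a stand-in for a real embedding model (it uses bag-of-words cosine similarity), the structural signal is a count of call sites found with Python's built-in `ast` module, and the two rankings are merged with reciprocal rank fusion, which is one possible way to combine them rather than a prescribed one. All function names, file names, and snippets here are hypothetical.

```python
import ast
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase alphanumeric tokens; underscores split snake_case names."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_embedding_score(query, source):
    """ASSUMPTION: placeholder for a real embedding model (token-count cosine)."""
    q, d = Counter(tokens(query)), Counter(tokens(source))
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0


def structural_score(source, api_name):
    """Count call sites of `api_name` by walking the Python AST."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return 0  # unparsable snippets contribute no structural signal
    hits = 0
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            # foo(...) is an ast.Name; obj.foo(...) is an ast.Attribute
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if name == api_name:
                hits += 1
    return hits


def ranked(scores):
    """Names with a positive score, best first."""
    return [n for n, s in sorted(scores.items(), key=lambda kv: -kv[1]) if s > 0]


def hybrid_rank(query, api_name, snippets, k=60):
    """Fuse the semantic and structural rankings with reciprocal rank fusion."""
    semantic = ranked({n: toy_embedding_score(query, s) for n, s in snippets.items()})
    structural = ranked({n: structural_score(s, api_name) for n, s in snippets.items()})
    fused = Counter()
    for ranking in (semantic, structural):
        for rank, name in enumerate(ranking, start=1):
            fused[name] += 1.0 / (k + rank)
    return [name for name, _ in fused.most_common()]


# Hypothetical corpus: only loader.py actually calls `load`.
SNIPPETS = {
    "loader.py": "import json\n\ndef read_settings(path):\n"
                 "    with open(path) as fh:\n        return json.load(fh)\n",
    "notes.py": "# how to load json settings from a file\n"
                "def helper():\n    return None\n",
    "math_utils.py": "def add(a, b):\n    return a + b\n",
}

if __name__ == "__main__":
    print(hybrid_rank("load json settings from a file", "load", SNIPPETS))
```

In this toy corpus the similarity score alone favors `notes.py`, whose comment merely mentions the task, while the AST check finds the one file that really calls `load`, so the fused ranking puts `loader.py` first. In a real agent the placeholder scorer would be replaced by vector-store results, and the AST walk could be swapped for grep or a multi-language parser.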
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T07:15:43.566824+00:00— report_created — created