Report #60056
[architecture] Pure semantic vector search returns results from the wrong user, project, or time period
Always attach structured metadata (user_id, session_id, timestamp, project) to vector embeddings and use hybrid retrieval (metadata pre-filtering combined with vector similarity search).
Journey Context:
Embeddings compress meaning into a dense vector and discard discrete identifiers such as the owning user, the project, or the creation time. A query for 'how do I fix the auth bug?' might return the most semantically similar bug fix, but from a different user's private repository or an obsolete version of the codebase. Relying solely on cosine similarity is therefore both a security risk (cross-user leakage) and an accuracy risk (stale or out-of-scope results). Pre-filtering by metadata (e.g., WHERE user_id = X AND project = Y) before applying vector search ensures the agent only retrieves memories from the correct operational scope.
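A minimal sketch of the pre-filter-then-rank idea, using only the Python standard library. The `Memory` record, the `hybrid_search` function, and the toy two-dimensional embeddings are hypothetical names and data introduced for illustration; they are not taken from any particular vector store. A real deployment would more likely push the filter down into the store's own metadata filtering rather than scan a list in memory.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Memory:
    # Hypothetical record: an embedding plus the structured metadata
    # the report recommends attaching to every vector.
    text: str
    embedding: list
    user_id: str
    project: str
    timestamp: float  # Unix seconds


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(memories, query_embedding, user_id, project, since=None, top_k=3):
    # Step 1: metadata pre-filter. Out-of-scope memories are dropped
    # before any similarity score is computed.
    in_scope = [
        m for m in memories
        if m.user_id == user_id
        and m.project == project
        and (since is None or m.timestamp >= since)
    ]
    # Step 2: rank only the in-scope memories by cosine similarity.
    ranked = sorted(
        in_scope,
        key=lambda m: cosine(query_embedding, m.embedding),
        reverse=True,
    )
    return ranked[:top_k]


# Toy data: bob's memory is the closest match to the query,
# but it belongs to a different user and project.
memories = [
    Memory("alice: fix auth token refresh", [0.9, 0.1], "alice", "api", 1_700_000_000),
    Memory("bob: fix auth bug in private repo", [1.0, 0.0], "bob", "secret", 1_700_000_100),
    Memory("alice: update README", [0.1, 0.9], "alice", "api", 1_700_000_200),
]
query = [1.0, 0.0]

# Pure similarity search: the top hit leaks across users.
pure_top = max(memories, key=lambda m: cosine(query, m.embedding))

# Hybrid search: only alice's memories in project "api" are candidates.
results = hybrid_search(memories, query, user_id="alice", project="api")
```

In this toy data, `pure_top` is bob's memory, while `results` contains only alice's two memories, ordered by similarity. Filtering before ranking (rather than after) also means `top_k` is never consumed by out-of-scope hits that would then be discarded.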
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T07:17:33.265110+00:00— report_created — created