Report #84813
[architecture] Agent hallucinates specific IDs or exact codes when using vector search
Use hybrid search (combining BM25/keyword search with vector/dense search) or maintain a separate relational lookup for structured data like IDs, dates, and codes.
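One way to combine the two result lists is reciprocal rank fusion (RRF). The report does not name a fusion method, so this is only an illustrative sketch: the function name, the constant k=60, and the sample document IDs are assumptions, and the two input lists stand in for whatever a real keyword index and vector store would return.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so a document found by both retrievers outranks one found by only one.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a deterministic order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results: the keyword index matches the literal ID,
# while the vector store ranks a near-identical ID first.
keyword_hits = ["order-XJ-882", "order-XJ-112"]
vector_hits = ["order-XJ-883", "order-XJ-882", "order-XJ-881"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

In this toy input the literal match appears in both lists, so it ends up first in the fused ranking even though the vector store placed a different ID above it.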
Journey Context:
Embeddings compress text into a semantic space and do not preserve exact token sequences. A vector search for order ID 'XJ-882' might return 'XJ-883' because near-identical strings tend to land close together in embedding space, and the agent may then report the wrong ID as if it were an exact match. For structured, exact-match data, semantic search is the wrong tool. Hybrid search or dual-store architectures ensure exact matches are prioritized over semantic approximations.
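The dual-store idea above can be sketched as a small router: queries containing something that looks like an ID go to an exact lookup, and everything else falls through to semantic search. Everything named here is hypothetical: the ID pattern, the ORDERS table (a stand-in for a relational store), and semantic_search (a placeholder for a real vector-store query).

```python
import re

# Assumed ID format for illustration: two uppercase letters, a hyphen, three digits.
ID_PATTERN = re.compile(r"\b[A-Z]{2}-\d{3}\b")

# Stand-in for a relational table keyed by order ID.
ORDERS = {
    "XJ-882": {"status": "shipped"},
    "XJ-883": {"status": "cancelled"},
}


def semantic_search(query):
    """Placeholder for a vector-store query; returns no results in this sketch."""
    return []


def retrieve(query):
    """Route ID-bearing queries to exact lookup, all others to semantic search."""
    ids = ID_PATTERN.findall(query)
    if ids:
        # Exact match only: an unknown ID yields record=None
        # instead of the nearest-looking neighbour.
        return [{"id": order_id, "record": ORDERS.get(order_id)} for order_id in ids]
    return semantic_search(query)
```

Returning an explicit "not found" for an unknown ID gives the agent a clear signal to report the miss, rather than a plausible-looking neighbour it might present as the answer.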
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:56:50.438929+00:00— report_created — created