Embedding-based retrieval systems learn a shared semantic representation space for queries and items, enabling efficient retrieval through approximate nearest-neighbor search. However, current industrial implementations face a critical limitation: applying one fixed retrieval cutoff to all queries inevitably compromises performance, yielding insufficient recall for high-frequency (head) queries and reduced precision for low-frequency (tail) queries. This persistent challenge stems from the frequentist paradigms that dominate existing loss function designs. In this work, we introduce probabilistic Embedding-Based Retrieval (\textbf{pEBR}), a novel framework with two variants, one based on maximum likelihood estimation and one based on contrastive estimation. pEBR learns the underlying probability distribution of relevant items for each query, computes adaptive cosine similarity cutoffs via the cumulative distribution function (CDF) of that distribution, and thereby adapts automatically to the distinct characteristics of head and tail queries. Experiments and ablation studies demonstrate that pEBR simultaneously improves precision and recall while maintaining computational efficiency.
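To make the idea of a CDF-derived cutoff concrete, the following is a minimal sketch and not the pEBR method itself. It assumes, purely for illustration, that the cosine similarities of a query's relevant items are roughly normally distributed and that a handful of such similarities are available per query; the function names `adaptive_cutoff` and `retrieve`, the `coverage` parameter, and the sample values are all hypothetical.

```python
from statistics import NormalDist, mean, stdev


def adaptive_cutoff(relevant_sims, coverage=0.9):
    """Per-query similarity threshold from a fitted distribution.

    Assumption (not from the source): similarities of relevant items are
    modeled with a normal distribution. The threshold t is chosen so that
    P(sim >= t) = coverage under that fit, i.e. t = CDF^{-1}(1 - coverage).
    """
    dist = NormalDist(mean(relevant_sims), stdev(relevant_sims))
    t = dist.inv_cdf(1.0 - coverage)
    # Cosine similarity lives in [-1, 1], so clamp the threshold to that range.
    return max(-1.0, min(1.0, t))


def retrieve(candidate_sims, cutoff):
    """Keep every candidate whose similarity reaches the query's cutoff."""
    return [item for item, sim in candidate_sims.items() if sim >= cutoff]


# Hypothetical data: a head query with many, widely spread relevant items,
# and a tail query with few, tightly clustered ones.
head_sims = [0.55, 0.62, 0.70, 0.48, 0.66, 0.59, 0.73, 0.51]
tail_sims = [0.88, 0.91, 0.86, 0.90]

head_cut = adaptive_cutoff(head_sims)
tail_cut = adaptive_cutoff(tail_sims)

candidates = {"a": 0.9, "b": 0.5, "c": 0.3}
head_hits = retrieve(candidates, head_cut)
tail_hits = retrieve(candidates, tail_cut)
```

Under these made-up numbers the head query receives a lower threshold than the tail query, so it admits more candidates, while the tail query keeps only near matches. A fixed top-k or a single global threshold cannot produce this per-query behavior.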