Jointly Optimizing Query Encoder and Product Quantization to Improve Retrieval Performance

Recently, the Information Retrieval community has witnessed fast-paced advances in Dense Retrieval (DR), which performs first-stage retrieval by encoding documents into a low-dimensional embedding space and querying them with embedding-based search. Despite their impressive ranking performance, previous studies usually adopt brute-force search to acquire candidates, which is prohibitive in practical Web search scenarios because of its tremendous memory usage and time cost. To overcome these problems, vector compression methods, a branch of Approximate Nearest Neighbor Search (ANNS), have been adopted in many practical embedding-based retrieval applications. One of the most popular methods is Product Quantization (PQ). However, although existing vector compression methods, including PQ, can improve the efficiency of DR, they severely degrade retrieval performance because encoding and compression are optimized separately. To tackle this problem, we present JPQ, which stands for Joint optimization of query encoding and Product Quantization. It trains the query encoder and the PQ index jointly in an end-to-end manner based on three optimization strategies, namely ranking-oriented loss, PQ centroid optimization, and end-to-end negative sampling. We evaluate JPQ on two publicly available retrieval benchmarks. Experimental results show that JPQ significantly outperforms existing popular vector compression methods across different trade-off settings. Compared with previous DR models that use brute-force search, JPQ almost matches the best retrieval performance with a 30x compression of the index size. The compressed index further brings a 10x speedup on CPU and a 2x speedup on GPU in query latency.
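
As background for the abstract above, the following is a minimal sketch of plain, unsupervised Product Quantization with inner-product scoring, assuming NumPy and a small hand-written k-means. It is not the JPQ training procedure: it has no query encoder, ranking-oriented loss, or negative sampling, and it fits the centroids separately from any encoder, which is exactly the separation the abstract argues against. The function names `train_pq`, `encode_pq`, and `search_pq` and all parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def train_pq(vectors, num_subvectors, num_centroids, iters=10, seed=0):
    """Fit one k-means codebook per sub-space (plain PQ, no joint training)."""
    rng = np.random.default_rng(seed)
    n, dim = vectors.shape
    assert dim % num_subvectors == 0, "dim must be divisible by num_subvectors"
    assert num_centroids <= 256, "codes are stored as uint8 in this sketch"
    sub_dim = dim // num_subvectors
    codebooks = np.empty((num_subvectors, num_centroids, sub_dim), dtype=np.float32)
    for m in range(num_subvectors):
        sub = vectors[:, m * sub_dim:(m + 1) * sub_dim]
        # Initialise centroids from randomly chosen training points.
        centroids = sub[rng.choice(n, num_centroids, replace=False)].copy()
        for _ in range(iters):
            # Squared Euclidean distance from every point to every centroid.
            dists = ((sub[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
            assign = dists.argmin(1)
            for k in range(num_centroids):
                members = sub[assign == k]
                if len(members) > 0:  # keep the old centroid if a cluster is empty
                    centroids[k] = members.mean(0)
        codebooks[m] = centroids
    return codebooks


def encode_pq(vectors, codebooks):
    """Replace each sub-vector with the index of its nearest centroid."""
    num_subvectors, _, sub_dim = codebooks.shape
    codes = np.empty((vectors.shape[0], num_subvectors), dtype=np.uint8)
    for m in range(num_subvectors):
        sub = vectors[:, m * sub_dim:(m + 1) * sub_dim]
        dists = ((sub[:, None, :] - codebooks[m][None, :, :]) ** 2).sum(-1)
        codes[:, m] = dists.argmin(1)
    return codes


def search_pq(query, codes, codebooks, top_k=5):
    """Asymmetric search: the query stays uncompressed, documents are codes."""
    num_subvectors, _, sub_dim = codebooks.shape
    q_sub = query.reshape(num_subvectors, sub_dim)
    # table[m, k] = inner product of the m-th query sub-vector with centroid k.
    table = np.einsum("md,mkd->mk", q_sub, codebooks)
    # Score of a document = sum of table look-ups selected by its codes.
    scores = table[np.arange(num_subvectors), codes].sum(1)
    top = np.argsort(-scores)[:top_k]
    return top, scores[top]
```

In this toy setting, a 16-dimensional float32 vector (64 bytes) split into 4 sub-vectors is stored as 4 one-byte codes, a 16x reduction. The codebooks add a fixed overhead shared by all documents. Scoring a document takes only table look-ups and additions, which is where the latency benefit of a compressed index typically comes from.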
