This paper introduces Sparsified Late Interaction for Multi-vector retrieval with inverted indexes (SLIM). Multi-vector models have proven effective across a range of information retrieval tasks, but most of their pipelines require custom optimization to be efficient in both time and space. ColBERT is probably the most established of these methods; it relies on late interaction between the contextualized token embeddings of pre-trained language models. Whereas ColBERT's token embeddings are all low-dimensional and dense, SLIM projects each token embedding into a high-dimensional, sparse lexical space before performing late interaction. In practice, we further propose approximating SLIM with the lower and upper bounds of the late interaction to reduce latency and storage. The resulting sparse outputs can be easily incorporated into an inverted search index and are fully compatible with off-the-shelf search tools such as Pyserini and Elasticsearch. On information retrieval benchmarks such as MS MARCO Passages and BEIR, SLIM achieves accuracy competitive with ColBERT while being much smaller and faster on CPUs. Source code and data will be available at https://github.com/castorini/pyserini/blob/master/docs/experiments-slim.md.
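To make the idea of late interaction over sparse token vectors concrete, the following is a minimal illustrative sketch, not the SLIM implementation. It assumes each token is already represented as a non-negative sparse vector over vocabulary ids, and that late interaction is the ColBERT-style sum over query tokens of the maximum dot product with any document token. The names `SparseVec`, `late_interaction`, `max_pool`, `upper_bound`, and `lower_bound` are hypothetical. The two bounds shown are one pair that follows from non-negativity alone; the exact bounds used by SLIM may be defined differently.

```python
from typing import Dict, List

# Assumption: a token is a sparse map from vocabulary id to a non-negative weight.
SparseVec = Dict[int, float]


def dot(a: SparseVec, b: SparseVec) -> float:
    """Sparse dot product, iterating over the shorter vector."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def late_interaction(query: List[SparseVec], doc: List[SparseVec]) -> float:
    """Sum over query tokens of the best-matching document token score.

    Assumes `doc` contains at least one token vector.
    """
    return sum(max(dot(q, d) for d in doc) for q in query)


def max_pool(doc: List[SparseVec]) -> SparseVec:
    """Element-wise maximum over all document token vectors."""
    pooled: SparseVec = {}
    for d in doc:
        for t, w in d.items():
            if w > pooled.get(t, 0.0):
                pooled[t] = w
    return pooled


def upper_bound(query: List[SparseVec], doc: List[SparseVec]) -> float:
    """Upper bound: score every query token against the pooled document vector.

    Valid because, with non-negative weights, the max of a sum is at most
    the sum of the per-dimension maxima.
    """
    pooled = max_pool(doc)
    return sum(dot(q, pooled) for q in query)


def lower_bound(query: List[SparseVec], doc: List[SparseVec]) -> float:
    """Lower bound: keep only the single best dimension of each query token.

    Valid because dropping non-negative terms can only lower a dot product.
    """
    pooled = max_pool(doc)
    return sum(
        max((w * pooled.get(t, 0.0) for t, w in q.items()), default=0.0)
        for q in query
    )


# Toy data (made up for illustration): two query tokens, two document tokens.
query = [{1: 1.0, 2: 0.5}, {3: 2.0}]
doc = [{1: 0.5, 3: 1.0}, {2: 2.0}]

exact = late_interaction(query, doc)
low, high = lower_bound(query, doc), upper_bound(query, doc)
print(low, exact, high)
```

In this sketch both bounds depend on the document only through a single pooled sparse vector, which is the kind of term-weight representation an inverted index stores. The exact score, by contrast, needs every document token vector.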