Our objective is to introduce to the NLP community NMSLIB, an existing k-NN search library; FlexNeuART, a new retrieval toolkit; and their integration capabilities. NMSLIB, while being one of the fastest k-NN search libraries, is quite generic and supports a variety of distance/similarity functions. Because the library relies on distance-based, structure-agnostic algorithms, it can be further extended by adding new distances. FlexNeuART is a modular, extensible, and flexible toolkit for candidate generation in IR and QA applications, which supports mixing of classic and neural ranking signals. By extending NMSLIB, FlexNeuART can efficiently retrieve mixed dense and sparse representations (with weights learned from training data). In contrast, other retrieval systems work with purely sparse representations (e.g., Lucene), purely dense representations (e.g., FAISS and Annoy), or only perform mixing at the re-ranking stage.
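To make the idea of retrieving mixed dense and sparse representations concrete, the following is a minimal sketch, not the FlexNeuART or NMSLIB implementation. It assumes the mixed similarity is a weighted sum of a dense inner product and a sparse inner product, uses hand-set weights where the toolkit would learn them from training data, and substitutes an exhaustive scan for NMSLIB's k-NN search algorithms. All names (`hybrid_score`, `brute_force_knn`, `w_dense`, `w_sparse`) are hypothetical.

```python
from typing import Dict, List, Tuple

# A sparse vector maps a term/dimension id to its weight.
Sparse = Dict[int, float]
# A document (or query) is a pair: (dense vector, sparse vector).
Mixed = Tuple[List[float], Sparse]


def dense_dot(a: List[float], b: List[float]) -> float:
    """Inner product of two dense vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def sparse_dot(a: Sparse, b: Sparse) -> float:
    """Inner product of two sparse vectors; iterates over the smaller one."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b[k] for k, v in a.items() if k in b)


def hybrid_score(query: Mixed, doc: Mixed,
                 w_dense: float, w_sparse: float) -> float:
    """Weighted sum of the dense and sparse similarities (assumed form)."""
    return (w_dense * dense_dot(query[0], doc[0])
            + w_sparse * sparse_dot(query[1], doc[1]))


def brute_force_knn(query: Mixed, docs: List[Mixed], k: int,
                    w_dense: float, w_sparse: float) -> List[Tuple[int, float]]:
    """Return the k highest-scoring (doc_id, score) pairs by exhaustive scan.

    Ties are broken by the lower document id.
    """
    scored = [(i, hybrid_score(query, d, w_dense, w_sparse))
              for i, d in enumerate(docs)]
    return sorted(scored, key=lambda p: (-p[1], p[0]))[:k]


if __name__ == "__main__":
    docs: List[Mixed] = [
        ([1.0, 0.0], {0: 1.0}),
        ([0.0, 1.0], {1: 2.0}),
        ([1.0, 1.0], {}),
    ]
    query: Mixed = ([1.0, 0.0], {1: 1.0})
    # Changing the weights changes which document ranks first.
    print(brute_force_knn(query, docs, k=2, w_dense=1.0, w_sparse=1.0))
    print(brute_force_knn(query, docs, k=2, w_dense=1.0, w_sparse=0.0))
```

Because the score is computed in a single function over both representations, mixing happens during candidate generation rather than in a later re-ranking pass, which is the distinction the abstract draws.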