Unsupervised Graph-based Rank Aggregation for Improved Retrieval

This paper presents a robust and comprehensive graph-based rank aggregation approach for combining the results of isolated rankers in retrieval tasks. The method is unsupervised and independent of how the isolated ranks are produced, so it can combine arbitrary models defined by different ranking criteria, such as those based on textual, image, or hybrid content representations. We reformulate the ad-hoc retrieval problem as document retrieval based on fusion graphs, which we propose as a new unified representation model capable of merging multiple ranks and automatically expressing the inter-relationships of retrieval results. By doing so, we claim that the retrieval system can benefit from learning the manifold structure of datasets, leading to more effective results. Another contribution is that, unlike existing approaches, our graph-based aggregation formulation encapsulates contextual information encoded from multiple ranks, which can be used directly for ranking without further computation or processing steps over the graphs. Based on the graphs, a novel similarity retrieval score is formulated using an efficient computation of minimum common subgraphs. A further benefit over existing approaches is the absence of hyperparameters. A comprehensive experimental evaluation was conducted on diverse well-known public datasets composed of textual, image, and multimodal documents. The experiments demonstrate that our method reaches top performance, yielding better effectiveness scores than state-of-the-art baseline methods and large gains over the rankers being fused, which shows that the proposal can successfully represent queries through a unified graph-based model of rank fusions.
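To make the idea of a fusion graph more concrete, the following is a minimal illustrative sketch, not the formulation used in the paper. The abstract does not give the weighting scheme or the exact similarity score, so everything here is an assumption: the function names `build_fusion_graph` and `common_subgraph_similarity`, the reciprocal-rank node and edge weights, and the normalization are all hypothetical stand-ins for whatever the paper actually defines.

```python
from itertools import combinations


def build_fusion_graph(ranked_lists):
    """Merge several ranked lists (best result first) into one weighted graph.

    Assumed weighting (hypothetical, not taken from the paper):
    - node weight: sum of reciprocal ranks over all lists containing the document
    - edge weight: for each list containing both documents, add 1 / (worse rank)
    Each list is assumed to contain no duplicate document ids.
    """
    nodes, edges = {}, {}
    for ranking in ranked_lists:
        positions = list(enumerate(ranking, start=1))
        for pos, doc in positions:
            nodes[doc] = nodes.get(doc, 0.0) + 1.0 / pos
        for (pos_a, a), (pos_b, b) in combinations(positions, 2):
            key = frozenset((a, b))  # undirected edge
            edges[key] = edges.get(key, 0.0) + 1.0 / max(pos_a, pos_b)
    return nodes, edges


def common_subgraph_similarity(g1, g2):
    """Score two graphs by the weight of their shared nodes and edges.

    Shared elements keep the smaller of the two weights; the total is
    normalized by the heavier graph, so the score lies in [0, 1].
    This normalization is an assumption made for illustration.
    """
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    shared = sum(min(nodes1[d], nodes2[d]) for d in nodes1.keys() & nodes2.keys())
    shared += sum(min(edges1[e], edges2[e]) for e in edges1.keys() & edges2.keys())
    total = max(sum(nodes1.values()) + sum(edges1.values()),
                sum(nodes2.values()) + sum(edges2.values()))
    return shared / total if total else 0.0


# Two hypothetical rankers (e.g. one textual, one visual) answering the same query.
query_graph = build_fusion_graph([["d1", "d2", "d3"], ["d2", "d1", "d4"]])
# A fusion graph built the same way for a candidate document.
candidate_graph = build_fusion_graph([["d2", "d1", "d5"], ["d1", "d2", "d3"]])
score = common_subgraph_similarity(query_graph, candidate_graph)
```

In this sketch, candidates would be ranked by `score` against the query's graph. Note that the toy version takes the ranked lists as given and introduces no tunable parameters of its own, which loosely mirrors the hyperparameter-free claim, but it makes no attempt at the efficiency the abstract mentions.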
