A Linguistic Study on Relevance Modeling in Information Retrieval

Relevance plays a central role in information retrieval (IR), a field that has been studied extensively since the 20th century. The definition and modeling of relevance have always been critical challenges in both information science and computer science. Alongside the ongoing debate over and exploration of relevance, IR has become a core task in many real-world applications, such as Web search engines, question answering systems, and conversational bots. While relevance acts as a unified concept across all these retrieval tasks, its inherent definitions differ considerably because the tasks themselves are heterogeneous. This raises a question: do these different forms of relevance really lead to different modeling focuses? To answer it, we conduct an empirical study on relevance modeling in three representative IR tasks: document retrieval, answer retrieval, and response retrieval. Specifically, we study the following two questions: 1) Does relevance modeling in these tasks really show differences in terms of natural language understanding (NLU)? To answer this question, we employ 16 linguistic tasks to probe a unified retrieval model over the three retrieval tasks. 2) If differences do exist, how can we leverage the findings to enhance relevance modeling? We propose three intervention methods to investigate how the different modeling focuses of relevance can be leveraged to improve these IR tasks. We believe that both the way we study the problem and our findings will be beneficial to the IR community.
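
To make the idea of probing concrete, the sketch below shows one common form of it: a simple linear classifier is trained on frozen representations to predict a linguistic property, and its held-out accuracy is read as a rough signal of how much of that property the representations encode. The abstract does not specify the probing setup, so this is an illustration under stated assumptions rather than the method used in the study. All names (`train_linear_probe`, `probe_accuracy`, `toy_representations`, `run_probe`) are hypothetical, and the features are synthetic stand-ins for real model representations.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(features, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression probe on frozen features (batch gradient descent)."""
    dim = len(features[0])
    w, b = [0.0] * dim, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        # Only the probe's parameters are updated; the features stay fixed.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(w, b, features, labels):
    """Fraction of examples whose binary label the linear probe predicts correctly."""
    hits = 0
    for x, y in zip(features, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        hits += int((score > 0) == (y == 1))
    return hits / len(labels)


def toy_representations(rng, n, encodes_property):
    """Synthetic stand-in for frozen model representations (assumption, not real data)."""
    features, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        # When the property is encoded, the first dimension shifts with the label.
        signal = (2 * y - 1) if encodes_property else 0
        features.append([signal + rng.gauss(0, 0.3), rng.gauss(0, 1)])
        labels.append(y)
    return features, labels


def run_probe(encodes_property, seed=0):
    """Train a probe on one split and report accuracy on a held-out split."""
    rng = random.Random(seed)
    train_x, train_y = toy_representations(rng, 200, encodes_property)
    test_x, test_y = toy_representations(rng, 100, encodes_property)
    w, b = train_linear_probe(train_x, train_y)
    return probe_accuracy(w, b, test_x, test_y)


acc_informative = run_probe(encodes_property=True)
acc_noise = run_probe(encodes_property=False)
print(f"probe accuracy (property encoded): {acc_informative:.2f}")
print(f"probe accuracy (property absent):  {acc_noise:.2f}")
```

In a setting like the one described above, the synthetic features would be replaced by representations taken from the same retrieval model trained on each retrieval task, and the probe accuracies on each linguistic task would be compared across the three. Keeping the probe linear is a deliberate choice in this sketch: a weak classifier makes it more plausible that high accuracy reflects the representations rather than the probe itself.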
