Can Open-Domain QA Reader Utilize External Knowledge Efficiently like Humans?

Recent state-of-the-art open-domain QA models are typically based on a two-stage retriever-reader approach, in which the retriever first finds the relevant knowledge/passages and the reader then leverages them to predict the answer. Prior work has shown that reader performance tends to improve as the number of passages increases. Thus, state-of-the-art models use a large number of passages (e.g., 100) for inference. While the reader in this approach achieves high prediction performance, its inference is computationally very expensive. Humans, on the other hand, use a more efficient strategy when answering: first, if we can confidently answer the question using knowledge we have already acquired, we do not use external knowledge at all; and when we do require external knowledge, we do not read all of it at once but only as much as is sufficient to find the answer. Motivated by this procedure, we ask the research question: "Can the open-domain QA reader utilize external knowledge efficiently like humans without sacrificing prediction performance?" Driven by this question, we explore an approach that utilizes both 'closed-book' inference (leveraging knowledge already present in the model parameters) and 'open-book' inference (leveraging external knowledge). Furthermore, instead of using a large fixed number of passages for open-book inference, we dynamically read the external knowledge in multiple 'knowledge iterations'. Through comprehensive experiments on the NQ and TriviaQA datasets, we demonstrate that this dynamic reading approach improves both the 'inference efficiency' and the 'prediction accuracy' of the reader. Compared with the FiD reader, this approach matches its accuracy while using just 18.32% of its reader inference cost, and also outperforms it, achieving up to 55.10% accuracy on NQ Open.
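
The control flow described in the abstract (closed-book inference first, then open-book inference over a growing set of passages) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how the reader decides it can answer confidently, so the confidence `threshold`, the `step` size per knowledge iteration, the choice to re-read the cumulative passage prefix in each iteration, and the names `dynamic_read`, `toy_closed_book`, and `toy_open_book` are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

# (answer, confidence in [0, 1]); the confidence signal is an assumption.
Prediction = Tuple[str, float]


def dynamic_read(
    question: str,
    passages: Sequence[str],
    closed_book: Callable[[str], Prediction],
    open_book: Callable[[str, Sequence[str]], Prediction],
    threshold: float = 0.8,  # hypothetical stopping threshold
    step: int = 10,          # hypothetical passages added per iteration
) -> Tuple[str, int]:
    """Return (answer, number of retrieved passages read)."""
    # Closed-book inference: try to answer from parametric knowledge only.
    answer, confidence = closed_book(question)
    if confidence >= threshold:
        return answer, 0

    # Open-book inference in knowledge iterations: read a larger prefix of
    # the ranked passages each time and stop once the reader is confident.
    used = 0
    while used < len(passages):
        used = min(used + step, len(passages))
        answer, confidence = open_book(question, passages[:used])
        if confidence >= threshold:
            break
    return answer, used


# Toy stand-ins for real reader models, for demonstration only.
def toy_closed_book(question: str) -> Prediction:
    known = {"capital of france?": ("Paris", 0.95)}
    return known.get(question.lower(), ("unknown", 0.1))


def toy_open_book(question: str, passages: Sequence[str]) -> Prediction:
    for passage in passages:
        if "Canberra" in passage:
            return "Canberra", 0.9
    return "unknown", 0.2


if __name__ == "__main__":
    ranked = (
        ["filler"] * 25
        + ["Canberra is the capital of Australia."]
        + ["filler"] * 74
    )
    print(dynamic_read("Capital of France?", ranked, toy_closed_book, toy_open_book))
    print(dynamic_read("Capital of Australia?", ranked, toy_closed_book, toy_open_book))
```

With these toy readers, the first question is answered closed-book without reading any passage, and the second stops in the third iteration, after reading 30 of the 100 passages. A real system would replace the toy functions with trained readers and would need a calibrated confidence estimate.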
