Pay More Attention - Neural Architectures for Question-Answering

Machine comprehension is a representative task of natural language understanding. Typically, we are given a context paragraph, and the objective is to answer a question that depends on that context. Solving this problem requires modeling the complex interactions between the context paragraph and the question. Recently, attention mechanisms have proven quite successful at these tasks; in particular, mechanisms in which attention flows both from context to question and from question to context have been shown to be useful. In this paper, we study two state-of-the-art attention mechanisms, Bi-Directional Attention Flow (BiDAF) and the Dynamic Co-Attention Network (DCN), and propose a hybrid scheme combining the two architectures that gives better overall performance. We also propose a new, simpler attention mechanism, which we call Double Cross Attention (DCA), that outperforms both the BiDAF and Co-Attention mechanisms while performing similarly to the hybrid scheme. Our focus is specifically on the attention layer and on improvements to it. Our experimental evaluations show that both proposed models achieve superior results on the Stanford Question Answering Dataset (SQuAD) compared to the BiDAF and DCN attention mechanisms.
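To make the idea of attention flowing in both directions concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's BiDAF, DCN, hybrid, or DCA layer: the similarity score is a plain dot product (the published models use their own learned similarity functions), and the names `bidirectional_attention`, `context`, and `question` are hypothetical.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def bidirectional_attention(context, question):
    """Attend in both directions between a context and a question.

    context:  (N, d) array, one d-dimensional vector per context word.
    question: (M, d) array, one d-dimensional vector per question word.
    Assumption: dot-product similarity stands in for a learned one.
    """
    # similarity[i, j] scores context word i against question word j.
    similarity = context @ question.T                 # (N, M)

    # Context-to-question: each context word takes a weighted
    # average of the question vectors (each row sums to 1).
    c2q = softmax(similarity, axis=1) @ question      # (N, d)

    # Question-to-context: each question word takes a weighted
    # average of the context vectors (each column sums to 1).
    q2c = softmax(similarity, axis=0).T @ context     # (M, d)

    return c2q, q2c


rng = np.random.default_rng(0)
context = rng.normal(size=(5, 4))    # 5 context words, dimension 4
question = rng.normal(size=(3, 4))   # 3 question words, dimension 4
c2q, q2c = bidirectional_attention(context, question)
print(c2q.shape, q2c.shape)
```

Both outputs come from the same similarity matrix and differ only in the axis that is normalized. That shared matrix is the common starting point for attention layers of this kind; how the two attended outputs are then combined is where the individual architectures differ.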

Related content

Attention mechanism

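The attention mechanism was first proposed in the field of computer vision, but it really took off with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied attention to an RNN model for image classification. Subsequently, Bahdanau et al., in "Neural Machine Translation by Jointly Learning to Align and Translate" [1], used an attention-like mechanism to perform translation and alignment simultaneously on a machine translation task; their work can be considered the first to apply attention to NLP. Attention-based extensions of RNN models were then applied to a wide range of NLP tasks. More recently, how to use attention in CNNs has also become an active research topic. The figure below shows the rough trend of progress in attention research.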
