A high-quality abstractive summary should not only copy salient source text but should also generate new conceptual words to express concrete details. Inspired by the popular pointer-generator sequence-to-sequence model, this paper presents a concept pointer network that improves both aspects of abstractive summarization. The network leverages knowledge-based, context-aware conceptualizations to derive an extended set of candidate concepts. The model then points to the most appropriate choice using both the concept set and the original source text. This joint approach generates abstractive summaries with higher-level semantic concepts. Training is also optimized to adapt to different data through a novel distantly-supervised learning method guided by the reference summaries and the test set. Overall, the proposed approach provides statistically significant improvements over several state-of-the-art models on both the DUC-2004 and Gigaword datasets. A human evaluation of the model's abstractive abilities also supports the quality of the summaries produced within this framework.
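The pointing step described above can be illustrated with a small sketch. It is not the paper's formulation, which the abstract does not give. It assumes a pointer-generator style mixture in which a generation probability splits mass between the vocabulary distribution and the attention over source tokens, and in which a fixed, hypothetical `concept_share` redirects part of each source token's pointer mass to its candidate concepts. The function name, the parameters, and the toy numbers are all assumptions made for illustration.

```python
from collections import defaultdict


def mix_distributions(p_gen, vocab_dist, attention, source_tokens,
                      concept_candidates, concept_share=0.5):
    """Combine generation, copying, and concept pointing into one distribution.

    Illustrative assumption, not the paper's equations:
      - p_gen is the probability of generating from the vocabulary.
      - attention[i] is the pointer weight on source_tokens[i].
      - concept_candidates maps a source token to {concept: weight}, where the
        weights for each token sum to 1.
      - concept_share is the fraction of a token's pointer mass that is passed
        on to its concepts (hypothetical fixed value; a model would learn it).
    """
    final = defaultdict(float)

    # Generation path: scale the vocabulary distribution by p_gen.
    for word, prob in vocab_dist.items():
        final[word] += p_gen * prob

    # Pointer path: split each token's mass between the token and its concepts.
    for attn, token in zip(attention, source_tokens):
        pointer_mass = (1.0 - p_gen) * attn
        candidates = concept_candidates.get(token, {})
        if not candidates:
            # No concepts known for this token: all of its mass is plain copying.
            final[token] += pointer_mass
            continue
        final[token] += pointer_mass * (1.0 - concept_share)
        for concept, weight in candidates.items():
            final[concept] += pointer_mass * concept_share * weight

    return dict(final)


# Toy inputs (made up for illustration only).
source_tokens = ["apple", "sues", "samsung"]
attention = [0.5, 0.2, 0.3]
vocab_dist = {"company": 0.4, "sues": 0.6}
concept_candidates = {
    "apple": {"company": 0.7, "fruit": 0.3},
    "samsung": {"company": 1.0},
}

dist = mix_distributions(0.4, vocab_dist, attention, source_tokens,
                         concept_candidates)
best_word = max(dist, key=dist.get)
```

In this toy setup a higher-level word such as "company" collects probability from the vocabulary and from the concepts of two different source tokens, which is the intuition behind pointing over both the concept set and the source text. A word such as "fruit", absent from the toy vocabulary, becomes reachable only through the concept path.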