Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks
Authors: Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang
We study the problem of incorporating prior knowledge into deep Transformer-based models to enhance their performance on semantic textual matching tasks. By probing and analyzing what BERT already knows when solving this task, we obtain a better understanding of what task-specific knowledge BERT needs most, and where it needs it most. This analysis further motivates an approach that differs from most existing work. Instead of using prior knowledge to create a new training task for fine-tuning BERT, we inject the knowledge directly into BERT's multi-head attention mechanism. This leads to a simple yet effective approach that enjoys a fast training stage, as it saves the model from training on any additional data or tasks beyond the main task. Extensive experiments demonstrate that the proposed knowledge-enhanced BERT consistently improves semantic textual matching performance, with the largest gains observed when training data is scarce.
https://www.zhuanzhi.ai/paper/7b48ad08e4eaf1a9d87baf6474bec12f
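As a rough illustration of the idea of injecting prior knowledge into attention, the sketch below adds a prior word-relatedness matrix as a bias on the attention logits of a single head. This is a minimal, hypothetical example: the additive-bias fusion, the `prior` matrix, and the weight `lam` are assumptions for illustration, not the paper's exact formulation.

```python
# Minimal sketch (not the authors' released code): bias one attention head's
# logits with a prior relatedness matrix before the softmax.
import torch
import torch.nn.functional as F

def attention_with_prior(q, k, v, prior, lam=1.0):
    """q, k, v: (batch, seq, dim); prior: (batch, seq, seq) relatedness scores (assumed given)."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5   # standard scaled dot-product logits
    scores = scores + lam * prior                  # nudge attention toward knowledge-linked word pairs
    weights = F.softmax(scores, dim=-1)
    return weights @ v

# Toy usage with random tensors
batch, seq, dim = 2, 8, 64
q, k, v = (torch.randn(batch, seq, dim) for _ in range(3))
prior = torch.rand(batch, seq, seq)               # e.g., synonym / lexical-overlap indicators
out = attention_with_prior(q, k, v, prior)
print(out.shape)                                  # torch.Size([2, 8, 64])
```

Because the knowledge enters through the attention computation itself rather than through an auxiliary training task, fine-tuning proceeds on the main task only, which is what keeps the training stage fast.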
Speech emotion recognition is a crucial problem manifesting in a multitude of applications such as human-computer interaction and education. Although several advances have been made in recent years, especially with the advent of Deep Neural Networks (DNNs), most studies in the literature fail to consider the semantic information in the speech signal. In this paper, we propose a novel framework that can capture both the semantic and the paralinguistic information in the signal. In particular, our framework comprises a semantic feature extractor, which captures the semantic information, and a paralinguistic feature extractor, which captures the paralinguistic information. The semantic and paralinguistic features are then combined into a unified representation using a novel attention mechanism. The unified feature vector is passed through an LSTM to capture the temporal dynamics in the signal before the final prediction. To validate the effectiveness of our framework, we use the popular SEWA dataset of the AVEC challenge series and compare against the three winning papers. Our model provides state-of-the-art results in the valence and liking dimensions.
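To make the fusion-then-LSTM pipeline concrete, here is a minimal sketch of one possible realization. The simple softmax gate over the two feature streams, the dimensionalities, and the last-step prediction head are all assumptions for illustration; the paper's actual attention mechanism and feature extractors are more elaborate.

```python
# Minimal sketch, assuming a softmax gate fuses the semantic and paralinguistic
# streams per time step, followed by an LSTM over the fused sequence.
import torch
import torch.nn as nn

class FusionLSTM(nn.Module):
    def __init__(self, sem_dim, para_dim, hidden=128, n_out=2):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(sem_dim + para_dim, 2),
                                  nn.Softmax(dim=-1))      # per-step weight for each stream
        self.proj_sem = nn.Linear(sem_dim, hidden)
        self.proj_para = nn.Linear(para_dim, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_out)                # e.g., valence and liking

    def forward(self, sem, para):
        # sem: (batch, time, sem_dim); para: (batch, time, para_dim)
        w = self.attn(torch.cat([sem, para], dim=-1))       # (batch, time, 2)
        fused = w[..., :1] * self.proj_sem(sem) + w[..., 1:] * self.proj_para(para)
        out, _ = self.lstm(fused)                           # temporal dynamics
        return self.head(out[:, -1])                        # predict from the last time step

# Toy usage with random features (dimensions are placeholders)
model = FusionLSTM(sem_dim=768, para_dim=88)
sem = torch.randn(4, 50, 768)    # e.g., text-derived semantic features per frame
para = torch.randn(4, 50, 88)    # e.g., acoustic/paralinguistic features per frame
print(model(sem, para).shape)    # torch.Size([4, 2])
```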