A deep neural network (DNN) is a deep-learning architecture: a neural network with at least one hidden layer. Like shallow neural networks, DNNs can model complex nonlinear systems, but the additional layers provide higher levels of abstraction and thereby increase the model's capacity.


Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks

Authors: Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang

We study the problem of incorporating prior knowledge into a deep Transformer-based model, namely BERT, to enhance its performance on semantic textual matching tasks. By probing and analyzing what BERT already knows when solving this task, we gain a better understanding of what task-specific knowledge BERT needs most, and where it is needed most. This analysis further motivates us to take an approach different from most existing works. Instead of using prior knowledge to create a new training task for fine-tuning BERT, we directly inject knowledge into BERT's multi-head attention mechanism. This leads us to a simple yet effective approach that enjoys fast training, since it saves the model from training on additional data or tasks beyond the main task. Extensive experiments demonstrate that the proposed knowledge-enhanced BERT consistently improves semantic textual matching performance, and that the performance benefit is most salient when training data is scarce.

https://www.zhuanzhi.ai/paper/7b48ad08e4eaf1a9d87baf6474bec12f
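The abstract above says the knowledge is injected directly into BERT's multi-head attention rather than introduced through an extra training task. The paper's exact formulation is not reproduced here; the following NumPy sketch only illustrates the general idea of biasing pre-softmax attention scores with a prior-knowledge matrix, where `prior` (e.g. word-pair similarity scores) and the mixing weight `alpha` are hypothetical names of my own.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_with_prior(Q, K, V, prior, alpha=1.0):
    """Scaled dot-product attention whose scores are shifted by an
    additive prior-knowledge bias before the softmax.

    Q, K, V : (n, d) query/key/value matrices for one attention head
    prior   : (n, n) token-pair prior scores (assumed, e.g. lexical similarity)
    alpha   : hypothetical scalar controlling the strength of the prior
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)     # standard attention logits, shape (n, n)
    scores = scores + alpha * prior   # inject prior knowledge as an additive bias
    return softmax(scores, axis=-1) @ V
```

With `alpha = 0` (or a zero `prior`) this reduces to ordinary scaled dot-product attention, so the bias can be tuned without changing the rest of the architecture.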


Latest Papers

Speech emotion recognition is a crucial problem manifesting in a multitude of applications such as human-computer interaction and education. Although several advancements have been made in recent years, especially with the advent of Deep Neural Networks (DNN), most of the studies in the literature fail to consider the semantic information in the speech signal. In this paper, we propose a novel framework that can capture both the semantic and the paralinguistic information in the signal. In particular, our framework consists of a semantic feature extractor, which captures the semantic information, and a paralinguistic feature extractor, which captures the paralinguistic information. Both semantic and paralinguistic features are then combined into a unified representation using a novel attention mechanism. The unified feature vector is passed through an LSTM to capture the temporal dynamics in the signal before the final prediction. To validate the effectiveness of our framework, we use the popular SEWA dataset of the AVEC challenge series and compare with the three winning papers. Our model provides state-of-the-art results in the valence and liking dimensions.
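The abstract describes fusing the semantic and paralinguistic feature vectors into one representation via attention before the LSTM. The paper's attention mechanism is not specified here, so the following NumPy sketch shows only one simple variant: scoring each modality vector with a shared vector `w` (a stand-in for the learned attention parameters) and taking the softmax-weighted sum.

```python
import numpy as np

def attention_fusion(semantic, paraling, w):
    """Fuse two modality feature vectors into one unified representation.

    semantic, paraling : (d,) feature vectors from the two extractors
    w                  : (d,) hypothetical learned scoring vector

    Each modality gets a scalar relevance score; the softmax over the two
    scores yields mixing weights, and the output is the weighted sum.
    """
    feats = np.stack([semantic, paraling])   # (2, d) stacked modalities
    scores = feats @ w                       # (2,) one score per modality
    weights = np.exp(scores - scores.max())  # stable softmax over 2 scores
    weights = weights / weights.sum()
    return weights @ feats                   # (d,) unified feature vector
```

In the framework described above, this unified vector would then be fed, timestep by timestep, into the LSTM that models the temporal dynamics.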
