The Speech Emotion Recognition (SER) task has seen significant improvements over the last years with the advent of Deep Neural Networks (DNNs). However, even the most successful methods still struggle when adaptation to specific speakers and scenarios is needed, inevitably leading to poorer performance compared to humans. In this paper, we present novel work based on the idea of teaching the emotion recognition network about speaker identity. Our system is a combination of two ACRNN classifiers dedicated respectively to speaker and emotion recognition. The former informs the latter through a Self Speaker Attention (SSA) mechanism that is shown to considerably help focus on the emotional information of the speech signal. Experiments on the social attitudes database Att-HACK and the IEMOCAP corpus demonstrate the effectiveness of the proposed method, which achieves state-of-the-art performance in terms of unweighted average recall.
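The abstract does not specify the SSA mechanism in detail; as a rough, hypothetical sketch of the general idea, the following shows one way a speaker embedding could be used as an attention query over frame-level emotion features so the emotion branch is conditioned on speaker identity. All function names, shapes, and the scaled dot-product formulation are our assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_speaker_attention(emotion_feats, speaker_emb):
    """Hypothetical sketch: weight emotion frames by similarity
    to a fixed speaker embedding, then pool.

    emotion_feats: (T, D) frame-level features from the emotion branch
    speaker_emb:   (D,)   utterance-level embedding from the speaker branch
    returns:       (D,)   speaker-conditioned emotion representation
    """
    d = emotion_feats.shape[-1]
    scores = emotion_feats @ speaker_emb / np.sqrt(d)  # (T,)
    weights = softmax(scores)                          # sums to 1 over frames
    return weights @ emotion_feats                     # weighted pooling

rng = np.random.default_rng(0)
T, D = 50, 64
feats = rng.standard_normal((T, D))   # stand-in for ACRNN emotion features
spk = rng.standard_normal(D)          # stand-in for speaker embedding
ctx = self_speaker_attention(feats, spk)
print(ctx.shape)  # (64,)
```

The pooled vector `ctx` would then feed the emotion classification head; in the actual system both branches are ACRNNs and the attention is learned jointly, which this toy example does not capture.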