开放式词汇关键词点点的学习音频文本协议 (Learning Audio-Text Agreement for Open-vocabulary Keyword Spotting) - 专知论文

会员服务 ·

0

Learning · 端到端 · 损失 · 稳健性 · 数据集 ·

2022 年 6 月 30 日

Learning Audio-Text Agreement for Open-vocabulary Keyword Spotting

翻译：开放式词汇关键词点点的学习音频文本协议

Hyeon-Kyeong Shin,Hyewon Han,Doyeon Kim,Soo-Whan Chung,Hong-Goo Kang

from arxiv, Accepted to Interspeech 2022

In this paper, we propose a novel end-to-end user-defined keyword spotting method that utilizes linguistically corresponding patterns between speech and text sequences. Unlike previous approaches requiring speech keyword enrollment, our method compares input queries with an enrolled text keyword sequence. To place the audio and text representations within a common latent space, we adopt an attention-based cross-modal matching approach that is trained in an end-to-end manner with monotonic matching loss and keyword classification loss. We also utilize a de-noising loss for the acoustic embedding network to improve robustness in noisy environments. Additionally, we introduce the LibriPhrase dataset, a new short-phrase dataset based on LibriSpeech for efficiently training keyword spotting models. Our proposed method achieves competitive results on various evaluation sets compared to other single-modal and cross-modal baselines.

翻译：在本文中,我们提出一种新的端到端用户定义关键字识别方法,该方法使用语音和文本序列之间的语言对应模式。与以往要求语音关键字注册的方法不同,我们的方法将输入询问与注册文本关键字序列进行比较。为了将音频和文本表达方式置于共同的潜在空间内,我们采用了一种基于注意的跨模式匹配方法,该方法以端到端方式培训,处理单声匹配损失和关键词分类损失。我们还利用声频嵌入网络去响损失,以提高噪音环境中的稳健性。此外,我们引入了LibriPhrase数据集,这是一套基于LibriSpeech的新的短句数据集,用于高效培训关键词识别模型。我们提议的方法在各种评价组合上取得了与其他单一模式和交叉模式基线相比的竞争性结果。

0

相关内容

Learning

“CVPR 2021 接受论文列表 1663篇论文都在这了

专知会员服务

32+阅读 · 2021年6月12日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

基于孪生随机形核的镁合金细观塑性本构与损伤行为研究

国家自然科学基金

0+阅读 · 2014年12月31日

PTHrP NLS与C末端促进小鼠脊髓损伤后髓鞘再生的作用及机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

采用原位同步辐射衍射研究纳米结构Cu/Ag多层膜的微机械行为

国家自然科学基金

0+阅读 · 2013年12月31日

几何结构形变空间的几何拓扑

国家自然科学基金

0+阅读 · 2012年12月31日

Mg-Al-Ca-Sr镁合金热变形行为的位错机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

层topos中的拓扑结构与序结构

国家自然科学基金

0+阅读 · 2011年12月31日

NiCrW高温合金Ni2(Cr,W)超点阵结构相变机制与热稳定性研究

国家自然科学基金

0+阅读 · 2011年12月31日

Period2基因调控人胶质瘤细胞凋亡的分子机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

建立抗疱疹病毒中草药提取物的AN高通量分子筛选模型

国家自然科学基金

0+阅读 · 2010年12月31日

TRAF2介导c-FLIP活化NF-κ#20449;号通路调控银屑病局部增殖和炎症的作用及其机制

国家自然科学基金

0+阅读 · 2009年12月31日

Open-Vocabulary Panoptic Segmentation with MaskCLIP

Arxiv

0+阅读 · 2022年8月18日

MMER: Multimodal Multi-task learning for Emotion Recognition in Spoken Utterances

MMER: Multimodal Multi-task learning for Emotion Recognition in Spoken Utterances

Arxiv

0+阅读 · 2022年8月18日

Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement

Arxiv

0+阅读 · 2022年8月18日

Few-shot Learning for Multi-label Intent Detection

Arxiv

21+阅读 · 2020年10月11日

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Arxiv

26+阅读 · 2020年2月10日

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Arxiv

17+阅读 · 2019年9月9日

CAN-NER: Convolutional Attention Network forChinese Named Entity Recognition

Arxiv

16+阅读 · 2019年4月3日

Learning Embedding Adaptation for Few-Shot Learning

Learning Embedding Adaptation for Few-Shot Learning

Arxiv

17+阅读 · 2018年12月10日

Deep Active Learning for Named Entity Recognition

Arxiv

15+阅读 · 2018年2月4日

Zero-Shot Transfer Learning for Event Extraction

Arxiv

10+阅读 · 2017年7月4日

VIP会员

文章信息

相关主题

相关VIP内容

“CVPR 2021 接受论文列表 1663篇论文都在这了

专知会员服务

32+阅读 · 2021年6月12日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

新型数字杀伤链：理解综合战术网络对野战炮兵体系的能力与效益

《对抗环境中运用数字孪生技术优化预测性维护与后勤保障》2025最新93页

《任务式指挥十六个案例研究》232页

《幻觉还是事实：国防大型语言模型的可信度评估研究》2025最新109页

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Open-Vocabulary Panoptic Segmentation with MaskCLIP

Arxiv

0+阅读 · 2022年8月18日

MMER: Multimodal Multi-task learning for Emotion Recognition in Spoken Utterances

MMER: Multimodal Multi-task learning for Emotion Recognition in Spoken Utterances

Arxiv

0+阅读 · 2022年8月18日

Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement

Arxiv

0+阅读 · 2022年8月18日

Few-shot Learning for Multi-label Intent Detection

Arxiv

21+阅读 · 2020年10月11日

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Arxiv

26+阅读 · 2020年2月10日

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Arxiv

17+阅读 · 2019年9月9日

CAN-NER: Convolutional Attention Network forChinese Named Entity Recognition

Arxiv

16+阅读 · 2019年4月3日

Learning Embedding Adaptation for Few-Shot Learning

Learning Embedding Adaptation for Few-Shot Learning

Arxiv

17+阅读 · 2018年12月10日

Deep Active Learning for Named Entity Recognition

Arxiv

15+阅读 · 2018年2月4日

Zero-Shot Transfer Learning for Event Extraction

Arxiv

10+阅读 · 2017年7月4日

相关基金

基于孪生随机形核的镁合金细观塑性本构与损伤行为研究

国家自然科学基金

0+阅读 · 2014年12月31日

PTHrP NLS与C末端促进小鼠脊髓损伤后髓鞘再生的作用及机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

采用原位同步辐射衍射研究纳米结构Cu/Ag多层膜的微机械行为

国家自然科学基金

0+阅读 · 2013年12月31日

几何结构形变空间的几何拓扑

国家自然科学基金

0+阅读 · 2012年12月31日

Mg-Al-Ca-Sr镁合金热变形行为的位错机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

层topos中的拓扑结构与序结构

国家自然科学基金

0+阅读 · 2011年12月31日

NiCrW高温合金Ni2(Cr,W)超点阵结构相变机制与热稳定性研究

国家自然科学基金

0+阅读 · 2011年12月31日

Period2基因调控人胶质瘤细胞凋亡的分子机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

建立抗疱疹病毒中草药提取物的AN高通量分子筛选模型

国家自然科学基金

0+阅读 · 2010年12月31日

TRAF2介导c-FLIP活化NF-κ#20449;号通路调控银屑病局部增殖和炎症的作用及其机制

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员