大规模学习一般代表制,以表彰发言者 (Large-scale learning of generalised representations for speaker recognition) - 专知论文

会员服务 ·

0

Performer · 声纹识别 · MoDELS · 多样性 · Learning ·

2022 年 10 月 20 日

Large-scale learning of generalised representations for speaker recognition

翻译：大规模学习一般代表制,以表彰发言者

Jee-weon Jung,Hee-Soo Heo,Bong-Jin Lee,Jaesong Lee,Hye-jin Shim,Youngki Kwon,Joon Son Chung,Shinji Watanabe

from arxiv, 5pages, 5 tables, submitted to ICASSP

The objective of this work is to develop a speaker recognition model to be used in diverse scenarios. We hypothesise that two components should be adequately configured to build such a model. First, adequate architecture would be required. We explore several recent state-of-the-art models, including ECAPA-TDNN and MFA-Conformer, as well as other baselines. Second, a massive amount of data would be required. We investigate several new training data configurations combining a few existing datasets. The most extensive configuration includes over 87k speakers' 10.22k hours of speech. Four evaluation protocols are adopted to measure how the trained model performs in diverse scenarios. Through experiments, we find that MFA-Conformer with the least inductive bias generalises the best. We also show that training with proposed large data configurations gives better performance. A boost in generalisation is observed, where the average performance on four evaluation protocols improves by more than 20%. In addition, we also demonstrate that these models' performances can improve even further when increasing capacity.

翻译：这项工作的目标是开发一个语音识别模型,用于不同的假设情景。我们假设, 两个组成部分应该有足够的配置来构建这样的模型。首先, 需要适当的结构。我们探索最近的一些最先进的模型, 包括 ECAPA- TDNN 和 MFA- Confrent, 以及其他基线。第二, 需要大量的数据。我们调查了几个新的培训数据配置, 将几个现有数据集结合起来。最广泛的配置包括87k 语言10.22公里的演讲时间。通过了四个评价协议, 以衡量经过培训的模型在不同情景中是如何运行的。我们通过实验发现, MFA- Coneder 最不具有引入性偏差的概括性能。我们还显示, 与拟议的大数据配置有关的培训效果更好。观察到了总体化的推动, 四个评价协议的平均绩效提高了20%以上。此外, 我们还表明, 这些模型的绩效在提高能力时还可以进一步提高。

0

相关内容

Performer

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

易回收磷钼酸铵基高效Cs+捕集纳米复合材料的制备、表征与吸附机理

国家自然科学基金

0+阅读 · 2015年12月31日

蓖麻矮化相关RcDof基因功能分析及调控机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

低能氮离子注入介导沙漠寡营养细菌全基因组突变及定向进化的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

Intraflagellar Transport运输纤毛蛋白的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

高分子纳米复合材料的界面结构及分子链松弛行为研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

电场对聚合物形态的控制

国家自然科学基金

0+阅读 · 2009年12月31日

高CO2选择性沸石分子筛膜的构筑和结构特性研究

国家自然科学基金

0+阅读 · 2009年12月31日

超临界二氧化碳中温敏型分子印迹聚合物微球的合成及其性能研究

国家自然科学基金

0+阅读 · 2008年12月31日

Cross-Modal Mutual Learning for Cued Speech Recognition

Arxiv

0+阅读 · 2022年12月2日

SoftCorrect: Error Correction with Soft Detection for Automatic Speech Recognition

Arxiv

0+阅读 · 2022年12月2日

The effect of speech pathology on automatic speaker verification -- a large-scale study

Arxiv

0+阅读 · 2022年12月2日

A Benchmark and Asymmetrical-Similarity Learning for Practical Image Copy Detection

Arxiv

0+阅读 · 2022年12月1日

Noisy Label Detection for Speaker Recognition

Arxiv

0+阅读 · 2022年12月1日

Efficient Use of Large Pre-Trained Models for Low Resource ASR

Arxiv

0+阅读 · 2022年11月30日

Sci-Net: a Scale Invariant Model for Buildings Segmentation from Aerial Imagery

Arxiv

0+阅读 · 2022年11月30日

Automatic Discovery of Multi-perspective Process Model using Reinforcement Learning

Arxiv

0+阅读 · 2022年11月30日

Adversarial Robustness of Representation Learning for Knowledge Graphs

Arxiv

10+阅读 · 2022年9月30日

Meta Learning for End-to-End Low-Resource Speech Recognition

Meta Learning for End-to-End Low-Resource Speech Recognition

Arxiv

20+阅读 · 2019年10月26日

VIP会员

文章信息

相关主题

相关VIP内容

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

新型数字杀伤链：理解综合战术网络对野战炮兵体系的能力与效益

《对抗环境中运用数字孪生技术优化预测性维护与后勤保障》2025最新93页

《任务式指挥十六个案例研究》232页

《幻觉还是事实：国防大型语言模型的可信度评估研究》2025最新109页

相关资讯

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

Cross-Modal Mutual Learning for Cued Speech Recognition

Arxiv

0+阅读 · 2022年12月2日

SoftCorrect: Error Correction with Soft Detection for Automatic Speech Recognition

Arxiv

0+阅读 · 2022年12月2日

The effect of speech pathology on automatic speaker verification -- a large-scale study

Arxiv

0+阅读 · 2022年12月2日

A Benchmark and Asymmetrical-Similarity Learning for Practical Image Copy Detection

Arxiv

0+阅读 · 2022年12月1日

Noisy Label Detection for Speaker Recognition

Arxiv

0+阅读 · 2022年12月1日

Efficient Use of Large Pre-Trained Models for Low Resource ASR

Arxiv

0+阅读 · 2022年11月30日

Sci-Net: a Scale Invariant Model for Buildings Segmentation from Aerial Imagery

Arxiv

0+阅读 · 2022年11月30日

Automatic Discovery of Multi-perspective Process Model using Reinforcement Learning

Arxiv

0+阅读 · 2022年11月30日

Adversarial Robustness of Representation Learning for Knowledge Graphs

Arxiv

10+阅读 · 2022年9月30日

Meta Learning for End-to-End Low-Resource Speech Recognition

Meta Learning for End-to-End Low-Resource Speech Recognition

Arxiv

20+阅读 · 2019年10月26日

相关基金

易回收磷钼酸铵基高效Cs+捕集纳米复合材料的制备、表征与吸附机理

国家自然科学基金

0+阅读 · 2015年12月31日

蓖麻矮化相关RcDof基因功能分析及调控机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

低能氮离子注入介导沙漠寡营养细菌全基因组突变及定向进化的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

Intraflagellar Transport运输纤毛蛋白的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

高分子纳米复合材料的界面结构及分子链松弛行为研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

电场对聚合物形态的控制

国家自然科学基金

0+阅读 · 2009年12月31日

高CO2选择性沸石分子筛膜的构筑和结构特性研究

国家自然科学基金

0+阅读 · 2009年12月31日

超临界二氧化碳中温敏型分子印迹聚合物微球的合成及其性能研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员