Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?


June 27, 2022


Sanyuan Chen, Yu Wu, Chengyi Wang, Shujie Liu, Zhuo Chen, Peidong Wang, Gang Liu, Jinyu Li, Jian Wu, Xiangzhan Yu, Furu Wei

From arXiv; accepted by INTERSPEECH 2022

Recently, self-supervised learning (SSL) has demonstrated strong performance in speaker recognition, even when the pre-training objective is designed for speech recognition. In this paper, we study which factors lead to the success of self-supervised learning on speaker-related tasks, e.g., speaker verification (SV), through a series of carefully designed experiments. Our empirical results on the VoxCeleb1 dataset suggest that the benefit of SSL to the SV task comes from a combination of the masked speech prediction loss, data scale, and model size, while the SSL quantizer has a minor impact. We further employ the integrated gradients attribution method and loss landscape visualization to understand the effectiveness of self-supervised learning for speaker recognition.
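For readers unfamiliar with the integrated gradients attribution method named in the abstract, the following is a minimal sketch of the general technique on a toy model. It is not the authors' setup: the logistic scorer, the `WEIGHTS` values, the all-zero baseline, and the function names are all hypothetical choices made for illustration. The method attributes a model's output to each input feature by averaging gradients along a straight path from a baseline to the input, then scaling by the input difference.

```python
import math

# Hypothetical weights for a toy logistic scorer (illustration only).
WEIGHTS = [0.5, -1.0, 2.0]


def model(x):
    """Toy differentiable model: sigmoid of a weighted sum of the features."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x))
    return 1.0 / (1.0 + math.exp(-z))


def model_grad(x):
    """Analytic gradient of the toy model with respect to each input feature."""
    s = model(x)
    return [s * (1.0 - s) * w for w in WEIGHTS]


def integrated_gradients(x, baseline, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum.

    Gradients are averaged at `steps` points on the straight line from
    `baseline` to `x`, then multiplied by (x - baseline) per feature.
    """
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grads = model_grad(point)
        totals = [t + g for t, g in zip(totals, grads)]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


if __name__ == "__main__":
    x = [1.0, 2.0, 0.5]
    baseline = [0.0, 0.0, 0.0]
    attributions = integrated_gradients(x, baseline)
    print("attributions:", attributions)
    # Completeness check: attributions should sum to model(x) - model(baseline).
    print("sum:", sum(attributions), "delta:", model(x) - model(baseline))
```

A useful sanity check for any implementation is the completeness property: the attributions should sum, up to numerical error, to the difference between the model output at the input and at the baseline.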
