Fine-tuning wav2vec2 for speaker recognition - 专知 (Zhuanzhi) Papers

Speaker recognition · Performer · Loss · Weight · Softmax

May 6, 2022

Fine-tuning wav2vec2 for speaker recognition

Nik Vaessen, David A. van Leeuwen

From arXiv; accepted to ICASSP 2022

This paper explores applying the wav2vec2 framework to speaker recognition instead of speech recognition. We study the effectiveness of the pre-trained weights on the speaker recognition task, and how to pool the wav2vec2 output sequence into a fixed-length speaker embedding. To adapt the framework to speaker recognition, we propose a single-utterance classification variant with CE or AAM softmax loss, and an utterance-pair classification variant with BCE loss. Our best performing variant, w2v2-aam, achieves a 1.88% EER on the extended voxceleb1 test set compared to 1.69% EER with an ECAPA-TDNN baseline. Code is available at https://github.com/nikvaessen/w2v2-speaker.
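As a rough illustration of two ideas named in the abstract, the sketch below pools a sequence of frame vectors into one fixed-length embedding and computes an additive angular margin (AAM) softmax loss for it. It is a minimal pure-Python sketch, not the authors' implementation: mean pooling is only one simple pooling choice, the function names are hypothetical, and the `margin` and `scale` defaults are illustrative assumptions rather than values taken from the paper. Common refinements, such as handling the case where the angle plus the margin exceeds pi, are omitted.

```python
import math


def mean_pool(frames):
    """Average a sequence of frame vectors (T x D) into one D-dim embedding."""
    num_frames = len(frames)
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / num_frames for d in range(dim)]


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def aam_softmax_loss(embedding, class_weights, target, margin=0.2, scale=30.0):
    """AAM softmax loss for a single embedding (illustrative defaults).

    Each logit is the cosine between the normalized embedding and a
    normalized class weight vector. For the target class the angle is
    increased by `margin` before taking the cosine, which makes the
    target harder to satisfy during training.
    """
    emb = l2_normalize(embedding)
    logits = []
    for idx, weight in enumerate(class_weights):
        cos_theta = sum(e * w for e, w in zip(emb, l2_normalize(weight)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding
        if idx == target:
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    # Numerically stable cross-entropy over the scaled logits.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(v - max_logit) for v in logits))
    return log_sum_exp - logits[target]


# Hypothetical toy input: three 2-dimensional frames and two speaker classes.
frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
speaker_weights = [[1.0, 0.0], [0.0, 1.0]]
embedding = mean_pool(frames)
loss_with_margin = aam_softmax_loss(embedding, speaker_weights, target=0)
loss_without_margin = aam_softmax_loss(embedding, speaker_weights, target=0, margin=0.0)
```

Setting `margin=0.0` reduces the function to a scaled cosine softmax with cross-entropy (CE), so the two loss values above show how the margin penalizes the same embedding more heavily.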

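Related content

Speaker recognition

Speaker recognition, also called voiceprint recognition (VPR), is a biometric technique that automatically determines a speaker's identity from the speaker-specific information carried in speech, using computers and modern information-recognition techniques. Research in this area aims to extract features that characterize the speaker, build effective models and systems on top of them, and so achieve accurate automatic speaker identification.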
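One common way to turn speaker features into an identity decision is to compare two fixed-length embeddings with cosine similarity and accept the pair as the same speaker when the score exceeds a threshold. The equal error rate (EER), the metric quoted in the abstract above, is the operating point where the false-accept and false-reject rates are equal. The following is a minimal sketch under stated assumptions: the function names and toy scores are hypothetical, the threshold sweep only approximates the EER on small trial lists, and both same-speaker and different-speaker trials are assumed to be present.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping the threshold over observed scores.

    labels: 1 for same-speaker trials, 0 for different-speaker trials.
    Returns the mean of the false-accept and false-reject rates at the
    threshold where the two rates are closest.
    """
    num_target = sum(labels)
    num_nontarget = len(labels) - num_target
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(scores)):
        false_accepts = sum(
            1 for s, y in zip(scores, labels) if y == 0 and s >= threshold
        )
        false_rejects = sum(
            1 for s, y in zip(scores, labels) if y == 1 and s < threshold
        )
        far = false_accepts / num_nontarget
        frr = false_rejects / num_target
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Hypothetical trial scores: one different-speaker pair scores higher
# than one same-speaker pair, so the two classes overlap.
trial_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
trial_labels = [1, 1, 1, 0, 1, 0, 0, 0]
eer = equal_error_rate(trial_scores, trial_labels)
```

In practice the scores would come from `cosine_similarity` applied to the embeddings of each trial pair; the literal list here only keeps the example short.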
