ASR联合自操作训练前训练 (Joint Encoder-Decoder Self-Supervised Pre-training for ASR) - 专知论文

会员服务 ·

0

SSL · Learning · 语音识别 · Performer · 解码 ·

2022 年 6 月 9 日

Joint Encoder-Decoder Self-Supervised Pre-training for ASR

翻译：ASR联合自操作训练前训练

Arunkumar A,Umesh S

from arxiv, Submitted to Interspeech 2022

Self-supervised learning (SSL) has shown tremendous success in various speech-related downstream tasks, including Automatic Speech Recognition (ASR). The output embeddings of the SSL model are treated as powerful short-time representations of the speech signal. However, in the ASR task, the main objective is to get the correct sequence of acoustic units, characters, or byte-pair encodings (BPEs). Usually, encoder-decoder architecture works exceptionally well for a sequence-to-sequence task like ASR. Therefore, in this paper, we propose a new paradigm that exploits the power of a decoder during self-supervised learning. We use Hidden Unit BERT (HuBERT) SSL framework to compute the conventional masked prediction loss for the encoder. In addition, we have introduced a decoder in the SSL framework and proposed a target preparation strategy for the decoder. Finally, we use a multitask SSL setup wherein we jointly optimize both the encoder and decoder losses. We hypothesize that the presence of a decoder in the SSL model helps it learn an acoustic unit-based language model, which might improve the performance of an ASR downstream task. We compare our proposed SSL model with HuBERT and show up to 25% relative improvement in performance on ASR by finetuning on various LibriSpeech subsets.

翻译：自我监督的学习(SSL)在各种与语言相关的下游任务(包括自动语音识别(ASR))中表现出了巨大的成功。SSL模式的输出嵌入将被视为语音信号的强大短期表示。然而,在ASR的任务中,主要目标是获得音响单元、字符或字节编码的正确序列。通常,编码器-代码器架构在像ASR这样的顺序顺序顺序上运作非常顺利。因此,我们在本文件中提出一种新的模式,在自我监督学习期间利用解码器的能量。我们使用隐藏单位 BERT(HuBERT) SSL 框架来计算编码器的常规掩码预测损失。此外,我们在SLF框架中引入了解码器,并提出了解码器的目标准备战略。最后,我们使用了一个多塔斯克 SSL设置,我们共同优化了编码器和解码器损失。我们假设了在SLSLA系统(SL)下游系统(SL)下游系统(SL)系统(SL)系统(SL)下游系统(SL)下游)系统(SLB)下级(SLV)中,我们可能用SLOB(SLOB)下级(SLOB)系统(SB)系统(SLO(SL)系统(SB)的相对语言)的运行模式)的模型来学习。

0

相关内容

SSL

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

专知会员服务

33+阅读 · 2020年1月5日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

专知

20+阅读 · 2018年6月29日

TRAF3IP3调控T细胞活性与肿瘤免疫的分子机制

国家自然科学基金

0+阅读 · 2016年12月31日

时间分辨共振拉曼光谱用于光感蛋白SRII/HtrII复合物的弱相互作用研究

国家自然科学基金

0+阅读 · 2014年12月31日

PARP-1调控急性肺损伤中中性粒细胞浸润和活化的作用及其分子机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

M2L2型水溶性金属-药物配合物的定向合成与抗肿瘤活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

血清中“不可见”小分子代谢物定量测定的NMR方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

Wnt3a参与学习记忆调控的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

框架的冗余度

国家自然科学基金

0+阅读 · 2012年12月31日

PTPN18的底物选择性和受调控机制

国家自然科学基金

0+阅读 · 2012年12月31日

Antrum mucosa protein-18（AMP-18）参与胃粘膜上皮细胞癌变的分子机制

国家自然科学基金

0+阅读 · 2009年12月31日

ClC-3氯通道蛋白在肿瘤转移中的功能研究

国家自然科学基金

0+阅读 · 2008年12月31日

Active Learning Strategies for Weakly-supervised Object Detection

Arxiv

0+阅读 · 2022年7月25日

Decoupled Adversarial Contrastive Learning for Self-supervised Adversarial Robustness

Arxiv

0+阅读 · 2022年7月22日

How Well Does Self-Supervised Pre-Training Perform with Streaming Data?

Arxiv

0+阅读 · 2022年7月21日

Graph Self-Supervised Learning: A Survey

Arxiv

15+阅读 · 2021年8月5日

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Arxiv

30+阅读 · 2021年7月28日

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Arxiv

18+阅读 · 2021年4月4日

PROP: Pre-training with Representative Words Prediction for Ad-hoc Retrieval

Arxiv

11+阅读 · 2020年10月20日

Pre-training Text Representations as Meta Learning

Arxiv

13+阅读 · 2020年4月12日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

Arxiv

15+阅读 · 2020年2月28日

VIP会员

文章信息

相关主题

相关VIP内容

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

专知会员服务

33+阅读 · 2020年1月5日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《用计算图变换加速实际工程设计优化》MIT 400页

《支持战术零信任架构实施的自动化零样本数据标记生成式人工智能方法》

如何快速获取数百万架无人机？

“掌控天空：反无人机系统（C-UAS）战略的最佳实践”研讨会13份幻灯片

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

专知

20+阅读 · 2018年6月29日

相关论文

Active Learning Strategies for Weakly-supervised Object Detection

Arxiv

0+阅读 · 2022年7月25日

Decoupled Adversarial Contrastive Learning for Self-supervised Adversarial Robustness

Arxiv

0+阅读 · 2022年7月22日

How Well Does Self-Supervised Pre-Training Perform with Streaming Data?

Arxiv

0+阅读 · 2022年7月21日

Graph Self-Supervised Learning: A Survey

Arxiv

15+阅读 · 2021年8月5日

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Arxiv

30+阅读 · 2021年7月28日

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Arxiv

18+阅读 · 2021年4月4日

PROP: Pre-training with Representative Words Prediction for Ad-hoc Retrieval

Arxiv

11+阅读 · 2020年10月20日

Pre-training Text Representations as Meta Learning

Arxiv

13+阅读 · 2020年4月12日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

Arxiv

15+阅读 · 2020年2月28日

相关基金

TRAF3IP3调控T细胞活性与肿瘤免疫的分子机制

国家自然科学基金

0+阅读 · 2016年12月31日

时间分辨共振拉曼光谱用于光感蛋白SRII/HtrII复合物的弱相互作用研究

国家自然科学基金

0+阅读 · 2014年12月31日

PARP-1调控急性肺损伤中中性粒细胞浸润和活化的作用及其分子机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

M2L2型水溶性金属-药物配合物的定向合成与抗肿瘤活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

血清中“不可见”小分子代谢物定量测定的NMR方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

Wnt3a参与学习记忆调控的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

框架的冗余度

国家自然科学基金

0+阅读 · 2012年12月31日

PTPN18的底物选择性和受调控机制

国家自然科学基金

0+阅读 · 2012年12月31日

Antrum mucosa protein-18（AMP-18）参与胃粘膜上皮细胞癌变的分子机制

国家自然科学基金

0+阅读 · 2009年12月31日

ClC-3氯通道蛋白在肿瘤转移中的功能研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员