推动非后退言论承认的限度 (Pushing the Limits of Non-Autoregressive Speech Recognition) - 专知论文

会员服务 ·

0

语音识别 · Conformer · 语言模型化 · state-of-the-art · Neural Networks ·

2021 年 4 月 7 日

Pushing the Limits of Non-Autoregressive Speech Recognition

翻译：推动非后退言论承认的限度

Edwin G. Ng,Chung-Cheng Chiu,Yu Zhang,William Chan

from arxiv, Submitted to Interspeech 2021

We combine recent advancements in end-to-end speech recognition to non-autoregressive automatic speech recognition. We push the limits of non-autoregressive state-of-the-art results for multiple datasets: LibriSpeech, Fisher+Switchboard and Wall Street Journal. Key to our recipe, we leverage CTC on giant Conformer neural network architectures with SpecAugment and wav2vec2 pre-training. We achieve 1.8%/3.6% WER on LibriSpeech test/test-other sets, 5.1%/9.8% WER on Switchboard, and 3.4% on the Wall Street Journal, all without a language model.

翻译：我们把端到端语音识别的最新进展与非偏向自动语音识别结合起来。我们推出多个数据集的非偏向状态结果限制: LibriSpeech, Fisher+Switchboard 和 Wall Street Journal。我们的食谱关键是,我们利用CTC 和 SpecAugment 和 wav2vec2 培训前的巨型神经网络结构。我们在LibriSpeech 测试/其他数据集上实现了1.8%/3.6%的WER,在交换机上实现了5.1%/9.8%的WER,在华尔街日报上实现了3.4%的WER, 全部没有语言模式。

0

相关内容

语音识别

语音识别是计算机科学和计算语言学的一个跨学科子领域，它发展了一些方法和技术，使计算机可以将口语识别和翻译成文本。它也被称为自动语音识别（ASR），计算机语音识别或语音转文本（STT）。它整合了计算机科学，语言学和计算机工程领域的知识和研究。

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

43+阅读 · 2020年11月2日

【ACL2020-亚马逊】Transformers多分辨率和多模态语音识别，Multiresolution and Multimodal Speech Recognition with Transformers

【ACL2020-亚马逊】Transformers多分辨率和多模态语音识别，Multiresolution and Multimodal Speech Recognition with Transformers

专知会员服务

14+阅读 · 2020年5月5日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

59+阅读 · 2020年3月19日

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

专知会员服务

32+阅读 · 2020年1月5日

【Python Tricks新书】The book: A Buffet of Awesome Python Features，299页pdf

【Python Tricks新书】The book: A Buffet of Awesome Python Features，299页pdf

专知会员服务

43+阅读 · 2020年1月1日

【ICLR2020】理解非自回归机器翻译中的知识蒸馏（Understanding Knowledge Distillation in Non-autoregressive Machine Translation）

【ICLR2020】理解非自回归机器翻译中的知识蒸馏（Understanding Knowledge Distillation in Non-autoregressive Machine Translation）

专知会员服务

10+阅读 · 2019年12月28日

【金融机器学习课程资料】Financial Machine Learning

专知会员服务

112+阅读 · 2019年12月24日

【综述】文献级机器翻译研究:方法与评价（A Survey on Document-level Machine Translation: Methods and Evaluation）

【综述】文献级机器翻译研究:方法与评价（A Survey on Document-level Machine Translation: Methods and Evaluation）

专知会员服务

6+阅读 · 2019年12月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

45+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

52+阅读 · 2019年10月17日

【资源】语音增强资源集锦

【资源】语音增强资源集锦

专知

8+阅读 · 2020年7月4日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

39+阅读 · 2019年6月9日

语音顶级会议Interspeech2018接受论文列表！

语音顶级会议Interspeech2018接受论文列表！

专知

6+阅读 · 2018年6月10日

【论文推荐】最新五篇命名实体识别相关论文—深度主动学习、Lattice LSTM、混合马尔可夫CRF

【论文推荐】最新五篇命名实体识别相关论文—深度主动学习、Lattice LSTM、混合马尔可夫CRF

专知

26+阅读 · 2018年5月22日

胶囊网络资源汇总

胶囊网络资源汇总

论智

7+阅读 · 2018年3月10日

开源自动语音识别系统wav2letter (附实现教程)

开源自动语音识别系统wav2letter (附实现教程)

七月在线实验室

9+阅读 · 2018年1月8日

干货 | 深度学习论文汇总

干货 | 深度学习论文汇总

AI科技评论

4+阅读 · 2018年1月1日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Rejuvenating Low-Frequency Words: Making the Most of Parallel Data in Non-Autoregressive Translation

Arxiv

0+阅读 · 2021年6月2日

Fast End-to-End Speech Recognition via Non-Autoregressive Models and Cross-Modal Knowledge Transferring from BERT

Arxiv

0+阅读 · 2021年5月31日

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Arxiv

10+阅读 · 2020年12月31日

Transformer based Grapheme-to-Phoneme Conversion

Arxiv

6+阅读 · 2020年4月14日

End-to-End Multi-speaker Speech Recognition with Transformer

Arxiv

8+阅读 · 2020年2月13日

SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

Arxiv

7+阅读 · 2019年4月18日

Improved Speech Enhancement with the Wave-U-Net

Arxiv

8+阅读 · 2018年11月27日

End-to-End Speech Recognition From the Raw Waveform

Arxiv

3+阅读 · 2018年6月19日

Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese

Arxiv

5+阅读 · 2018年6月4日

State-of-the-art Speech Recognition With Sequence-to-Sequence Models

Arxiv

7+阅读 · 2018年1月18日

VIP会员

文章信息

相关主题

语言模型化

state-of-the-art

Neural Networks

相关VIP内容

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

43+阅读 · 2020年11月2日

【ACL2020-亚马逊】Transformers多分辨率和多模态语音识别，Multiresolution and Multimodal Speech Recognition with Transformers

【ACL2020-亚马逊】Transformers多分辨率和多模态语音识别，Multiresolution and Multimodal Speech Recognition with Transformers

专知会员服务

14+阅读 · 2020年5月5日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

59+阅读 · 2020年3月19日

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

专知会员服务

32+阅读 · 2020年1月5日

【Python Tricks新书】The book: A Buffet of Awesome Python Features，299页pdf

【Python Tricks新书】The book: A Buffet of Awesome Python Features，299页pdf

专知会员服务

43+阅读 · 2020年1月1日

【ICLR2020】理解非自回归机器翻译中的知识蒸馏（Understanding Knowledge Distillation in Non-autoregressive Machine Translation）

【ICLR2020】理解非自回归机器翻译中的知识蒸馏（Understanding Knowledge Distillation in Non-autoregressive Machine Translation）

专知会员服务

10+阅读 · 2019年12月28日

【金融机器学习课程资料】Financial Machine Learning

专知会员服务

112+阅读 · 2019年12月24日

【综述】文献级机器翻译研究:方法与评价（A Survey on Document-level Machine Translation: Methods and Evaluation）

【综述】文献级机器翻译研究:方法与评价（A Survey on Document-level Machine Translation: Methods and Evaluation）

专知会员服务

6+阅读 · 2019年12月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

45+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

52+阅读 · 2019年10月17日

热门VIP内容

相关资讯

【资源】语音增强资源集锦

【资源】语音增强资源集锦

专知

8+阅读 · 2020年7月4日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

39+阅读 · 2019年6月9日

语音顶级会议Interspeech2018接受论文列表！

语音顶级会议Interspeech2018接受论文列表！

专知

6+阅读 · 2018年6月10日

【论文推荐】最新五篇命名实体识别相关论文—深度主动学习、Lattice LSTM、混合马尔可夫CRF

【论文推荐】最新五篇命名实体识别相关论文—深度主动学习、Lattice LSTM、混合马尔可夫CRF

专知

26+阅读 · 2018年5月22日

胶囊网络资源汇总

胶囊网络资源汇总

论智

7+阅读 · 2018年3月10日

开源自动语音识别系统wav2letter (附实现教程)

开源自动语音识别系统wav2letter (附实现教程)

七月在线实验室

9+阅读 · 2018年1月8日

干货 | 深度学习论文汇总

干货 | 深度学习论文汇总

AI科技评论

4+阅读 · 2018年1月1日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Rejuvenating Low-Frequency Words: Making the Most of Parallel Data in Non-Autoregressive Translation

Arxiv

0+阅读 · 2021年6月2日

Fast End-to-End Speech Recognition via Non-Autoregressive Models and Cross-Modal Knowledge Transferring from BERT

Arxiv

0+阅读 · 2021年5月31日

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Arxiv

10+阅读 · 2020年12月31日

Transformer based Grapheme-to-Phoneme Conversion

Arxiv

6+阅读 · 2020年4月14日

End-to-End Multi-speaker Speech Recognition with Transformer

Arxiv

8+阅读 · 2020年2月13日

SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

Arxiv

7+阅读 · 2019年4月18日

Improved Speech Enhancement with the Wave-U-Net

Arxiv

8+阅读 · 2018年11月27日

End-to-End Speech Recognition From the Raw Waveform

Arxiv

3+阅读 · 2018年6月19日

Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese

Arxiv

5+阅读 · 2018年6月4日

State-of-the-art Speech Recognition With Sequence-to-Sequence Models

Arxiv

7+阅读 · 2018年1月18日

微信扫码咨询专知VIP会员