Wav2vec 2.0 (W2V2) has shown impressive performance in automatic speech recognition (ASR). However, its large model size and non-streaming architecture make it difficult to use in low-resource or streaming scenarios. In this work, we propose a two-stage knowledge distillation method to address these two problems: the first stage compresses the large, non-streaming teacher model into a smaller student, and the second stage converts the student into a streaming model. Specifically, we adopt an MSE loss to distill the hidden layers and a modified LF-MMI loss to distill the prediction layer. Experiments are conducted on Gigaspeech, Librispeech, and an in-house dataset. The results show that the final distilled student model (DistillW2V2) is 8x faster and 12x smaller than the original teacher model. With a 480 ms latency setup, DistillW2V2's relative word error rate (WER) degradation ranges from 9% to 23.4% across test sets, which reveals a promising way to extend W2V2's application scope.
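To make the hidden-layer distillation concrete, the sketch below shows a generic MSE loss between matched teacher and student hidden representations. It is an illustrative implementation under assumed interfaces (the layer matching, the `projections` modules mapping student to teacher dimensions, and tensor shapes are assumptions), not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def hidden_layer_distill_loss(student_hiddens, teacher_hiddens, projections):
    """MSE distillation over a set of matched hidden layers.

    student_hiddens / teacher_hiddens: lists of tensors of shape
    (batch, time, dim). `projections` are linear layers mapping each
    student hidden size to the corresponding teacher hidden size.
    (Layer matching and projection setup are assumptions for this sketch.)
    """
    loss = 0.0
    for s, t, proj in zip(student_hiddens, teacher_hiddens, projections):
        # Teacher activations serve as fixed regression targets.
        loss = loss + F.mse_loss(proj(s), t.detach())
    return loss / len(projections)
```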