Typical ASR systems segment input audio into utterances using purely acoustic information; these utterances may not resemble the sentence-like units that conventional machine translation (MT) systems expect for Spoken Language Translation. In this work, we propose a model that corrects the acoustic segmentation of ASR output for low-resource languages to improve performance on downstream tasks. We use subtitles as a proxy dataset for this correction, creating synthetic acoustic utterances by modeling common segmentation error modes. We then train a neural tagging model to correct ASR acoustic segmentation and show that it improves downstream performance on MT and on audio-document cross-language information retrieval (CLIR).
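To make the proxy-data idea concrete, below is a minimal sketch, not the paper's exact procedure, of how sentence-segmented subtitle text could be turned into synthetic "acoustic" utterances with per-token boundary tags by simulating two common error modes: an acoustic segmenter cutting a sentence mid-way, and it running across a sentence boundary. The function name `make_synthetic_utterances`, the split/merge probabilities, and the "E"/"O" tag scheme are illustrative assumptions, not details from the work itself.

```python
import random

# Assumed error-mode probabilities (illustrative, not from the paper).
SPLIT_PROB = 0.3   # chance of splitting a sentence at a random interior point
MERGE_PROB = 0.5   # chance of merging a fragment with the preceding one

def make_synthetic_utterances(sentences):
    """sentences: list of token lists, one per subtitle sentence.

    Returns synthetic utterances paired with per-token tags:
    "E" if a true sentence ends after the token, else "O".
    """
    fragments = []
    for sent in sentences:
        tags = ["O"] * (len(sent) - 1) + ["E"]
        # Error mode 1: split the sentence mid-way, as an acoustic
        # segmenter might cut on a pause inside a sentence.
        if len(sent) > 3 and random.random() < SPLIT_PROB:
            cut = random.randint(1, len(sent) - 1)
            fragments.append((sent[:cut], tags[:cut]))
            fragments.append((sent[cut:], tags[cut:]))
        else:
            fragments.append((sent, tags))

    # Error mode 2: merge adjacent fragments, as an acoustic segmenter
    # might run across a sentence boundary without pausing.
    utterances = []
    for tokens, tags in fragments:
        if utterances and random.random() < MERGE_PROB:
            prev_tokens, prev_tags = utterances[-1]
            utterances[-1] = (prev_tokens + tokens, prev_tags + tags)
        else:
            utterances.append((tokens, tags))
    return utterances

if __name__ == "__main__":
    subs = [["hello", "there"], ["how", "are", "you", "today"], ["fine", "thanks"]]
    for toks, tags in make_synthetic_utterances(subs):
        print(list(zip(toks, tags)))
```

A neural tagger trained on such (utterance, tag) pairs can then predict sentence-boundary tags on real ASR output, and the predicted boundaries can be used to re-segment the transcript into sentence-like units before MT or CLIR.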