ILASR:在生产规模上为自动语音识别自动语音识别保留隐私的递增学习 (ILASR: Privacy-Preserving Incremental Learning for Automatic Speech Recognition at Production Scale) - 专知论文

会员服务 ·

0

Learning · 自动语音识别 · 语音识别 · MoDELS · 缩放 ·

2022 年 7 月 22 日

ILASR: Privacy-Preserving Incremental Learning for Automatic Speech Recognition at Production Scale

翻译：ILASR:在生产规模上为自动语音识别自动语音识别保留隐私的递增学习

Gopinath Chennupati,Milind Rao,Gurpreet Chadha,Aaron Eakin,Anirudh Raju,Gautam Tiwari,Anit Kumar Sahu,Ariya Rastrow,Jasha Droppo,Andy Oberlin,Buddha Nandanoor,Prahalad Venkataramanan,Zheng Wu,Pankaj Sitpure

from arxiv, 9 pages

Incremental learning is one paradigm to enable model building and updating at scale with streaming data. For end-to-end automatic speech recognition (ASR) tasks, the absence of human annotated labels along with the need for privacy preserving policies for model building makes it a daunting challenge. Motivated by these challenges, in this paper we use a cloud based framework for production systems to demonstrate insights from privacy preserving incremental learning for automatic speech recognition (ILASR). By privacy preserving, we mean, usage of ephemeral data which are not human annotated. This system is a step forward for production levelASR models for incremental/continual learning that offers near real-time test-bed for experimentation in the cloud for end-to-end ASR, while adhering to privacy-preserving policies. We show that the proposed system can improve the production models significantly(3%) over a new time period of six months even in the absence of human annotated labels with varying levels of weak supervision and large batch sizes in incremental learning. This improvement is 20% over test sets with new words and phrases in the new time period. We demonstrate the effectiveness of model building in a privacy-preserving incremental fashion for ASR while further exploring the utility of having an effective teacher model and use of large batch sizes.

翻译：递增学习是一种范例,可以使建模和升级规模与流数据相适应。对于终端到终端自动语音识别(ASR)任务而言,缺少人文附加说明的标签,加上需要为建模提供隐私保护政策,这带来了巨大的挑战。受这些挑战的驱使,在本文件中,我们使用基于云的生产系统框架,以展示从隐私保护递增学习到自动语音识别(ILASR)的洞察力。通过隐私保护,我们是指使用非人为附加说明的短时间数据。这个系统是制作ASR增量/连续学习模型的向前迈出的一步,该模型为终端到终端自动语音识别(ASR)的云层实验提供了近实时测试台,同时坚持隐私保护政策。我们表明,拟议的系统可以在6个月的新时期内大大改进生产模式(3%),即使没有具有不同程度的监管和渐进规模的人类附加说明标签。在新的时间段里,这一改进了20 %的测试组,并带有新的词和短语。我们展示了建模的实效,同时进一步探索了空间建设模型的效用,同时进一步探索了大规模使用。

0

相关内容

Learning

33页PPT【AI+天气预测】，AI and Machine learning for weather predictions

33页PPT【AI+天气预测】，AI and Machine learning for weather predictions

专知会员服务

34+阅读 · 2022年3月5日

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

lncRNA DATOC1影响microRNA成熟促进卵巢癌转移的分子机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

LncRNA介导肿瘤相关巨噬细胞促进乳腺癌转移分子机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

PKCα与UNC5B相互作用调控膀胱癌细胞药物敏感性的分子机制

国家自然科学基金

0+阅读 · 2013年12月31日

lincRNA-RP11-564C4.6/microRNA-195回路调控肝癌转移的机制和意义

国家自然科学基金

0+阅读 · 2013年12月31日

恶性肿瘤风险lncRNA多态位点识别及其功能注释研究

国家自然科学基金

0+阅读 · 2013年12月31日

冰城链霉菌米尔贝霉素生物合成调控基因-milR作用的分子机制

国家自然科学基金

0+阅读 · 2013年12月31日

CRMP4转录调控与前列腺癌转移机理的研究

国家自然科学基金

0+阅读 · 2012年12月31日

LIMK1：罗格列酮抑制人胃癌细胞增殖、迁移及侵袭的作用靶点

国家自然科学基金

0+阅读 · 2012年12月31日

Hedgehog信号通路调控宫颈癌上皮间质转化的作用及机制

国家自然科学基金

0+阅读 · 2011年12月31日

乙型肝炎病毒与宿主细胞相互作用的三维组学分析

国家自然科学基金

0+阅读 · 2009年12月31日

A Transferable and Automatic Tuning of Deep Reinforcement Learning for Cost Effective Phishing Detection

A Transferable and Automatic Tuning of Deep Reinforcement Learning for Cost Effective Phishing Detection

Arxiv

0+阅读 · 2022年9月19日

Class-Incremental Continual Learning into the eXtended DER-verse

Arxiv

0+阅读 · 2022年9月19日

Federated Coordinate Descent for Privacy-Preserving Multiparty Linear Regression

Arxiv

0+阅读 · 2022年9月19日

Active Learning for Domain Adaptation: An Energy-based Approach

Arxiv

13+阅读 · 2021年12月2日

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Arxiv

30+阅读 · 2021年7月28日

Privacy and Robustness in Federated Learning: Attacks and Defenses

Arxiv

35+阅读 · 2020年12月7日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Meta Learning for End-to-End Low-Resource Speech Recognition

Meta Learning for End-to-End Low-Resource Speech Recognition

Arxiv

20+阅读 · 2019年10月26日

Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition

Arxiv

10+阅读 · 2018年4月28日

CommanderSong: A Systematic Approach for Practical Adversarial Voice Recognition

Arxiv

14+阅读 · 2018年1月24日

VIP会员

文章信息

相关主题

自动语音识别

相关VIP内容

33页PPT【AI+天气预测】，AI and Machine learning for weather predictions

33页PPT【AI+天气预测】，AI and Machine learning for weather predictions

专知会员服务

34+阅读 · 2022年3月5日

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《美空军条令出版物：战略打击》最新条令

《高能激光武器》22页slides

军事前沿模型

《面向小型无人机或无人飞行器的创新雷达探测与人工智能分类技术》263页

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

A Transferable and Automatic Tuning of Deep Reinforcement Learning for Cost Effective Phishing Detection

A Transferable and Automatic Tuning of Deep Reinforcement Learning for Cost Effective Phishing Detection

Arxiv

0+阅读 · 2022年9月19日

Class-Incremental Continual Learning into the eXtended DER-verse

Arxiv

0+阅读 · 2022年9月19日

Federated Coordinate Descent for Privacy-Preserving Multiparty Linear Regression

Arxiv

0+阅读 · 2022年9月19日

Active Learning for Domain Adaptation: An Energy-based Approach

Arxiv

13+阅读 · 2021年12月2日

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Arxiv

30+阅读 · 2021年7月28日

Privacy and Robustness in Federated Learning: Attacks and Defenses

Arxiv

35+阅读 · 2020年12月7日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Meta Learning for End-to-End Low-Resource Speech Recognition

Meta Learning for End-to-End Low-Resource Speech Recognition

Arxiv

20+阅读 · 2019年10月26日

Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition

Arxiv

10+阅读 · 2018年4月28日

CommanderSong: A Systematic Approach for Practical Adversarial Voice Recognition

Arxiv

14+阅读 · 2018年1月24日

相关基金

lncRNA DATOC1影响microRNA成熟促进卵巢癌转移的分子机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

LncRNA介导肿瘤相关巨噬细胞促进乳腺癌转移分子机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

PKCα与UNC5B相互作用调控膀胱癌细胞药物敏感性的分子机制

国家自然科学基金

0+阅读 · 2013年12月31日

lincRNA-RP11-564C4.6/microRNA-195回路调控肝癌转移的机制和意义

国家自然科学基金

0+阅读 · 2013年12月31日

恶性肿瘤风险lncRNA多态位点识别及其功能注释研究

国家自然科学基金

0+阅读 · 2013年12月31日

冰城链霉菌米尔贝霉素生物合成调控基因-milR作用的分子机制

国家自然科学基金

0+阅读 · 2013年12月31日

CRMP4转录调控与前列腺癌转移机理的研究

国家自然科学基金

0+阅读 · 2012年12月31日

LIMK1：罗格列酮抑制人胃癌细胞增殖、迁移及侵袭的作用靶点

国家自然科学基金

0+阅读 · 2012年12月31日

Hedgehog信号通路调控宫颈癌上皮间质转化的作用及机制

国家自然科学基金

0+阅读 · 2011年12月31日

乙型肝炎病毒与宿主细胞相互作用的三维组学分析

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员