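Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methods and technologies enabling computers to recognize spoken language and transcribe it into text. It is also known as automatic speech recognition (ASR), computer speech recognition, or speech-to-text (STT). It draws on knowledge and research from computer science, linguistics, and computer engineering.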

VIP content

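The book 《语音识别基本法》 (roughly, "The Basic Law of Speech Recognition"), written under the direction of Tang Zhiyuan (汤志远), Li Lantian (李蓝天), and Wang Dong (王东), will soon be published by the Publishing House of Electronics Industry (电子工业出版社). The CSLT WeChat official account "清语赋" will serialize all of the book's chapters in order. Taking speech recognition as its core task, the book introduces the basic principles of speech recognition, the mainstream methods, and their implementation in Kaldi, and explores several topics in more depth, including denoising, keyword spotting, and domain adaptation. It closes with a summary of tasks related to speech recognition, including speaker recognition, language identification, emotion recognition, and speech synthesis. The book is aimed at entry-level readers interested in speech signal processing. Readers can use it to master the fundamentals of speech recognition, follow progress in related areas of speech information processing, and gain practical knowledge.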

Link:

http://cslt.riit.tsinghua.edu.cn/news.php?title=News-2020-07-10-1


Latest content

With the rapid development of deep learning techniques, the popularity of voice services implemented on various Internet of Things (IoT) devices is ever increasing. In this paper, we examine user-level membership inference in the problem space of voice services, by designing an audio auditor to verify whether a specific user had unwillingly contributed audio used to train an automatic speech recognition (ASR) model under strict black-box access. With user representation of the input audio data and their corresponding translated text, our trained auditor is effective in user-level audit. We also observe that the auditor trained on specific data generalizes well regardless of the ASR model architecture. We validate the auditor on ASR models trained with LSTM, RNNs, and GRU algorithms on two state-of-the-art pipelines, the hybrid ASR system and the end-to-end ASR system. Finally, we conduct a real-world trial of our auditor on iPhone Siri, achieving an overall accuracy exceeding 80%. We hope the methodology and findings developed in this paper can inform privacy advocates seeking to overhaul IoT privacy.

