Environment Aware Text-to-Speech Synthesis

This study aims to design an environment-aware text-to-speech (TTS) system that can generate speech suited to specific acoustic environments. It is also motivated by the desire to use large amounts of speech audio from heterogeneous sources in TTS system development. The key idea is to model the acoustic environment in speech audio as a factor of data variability and to incorporate it as a condition in neural-network-based speech synthesis. Two embedding extractors are trained on two purposely constructed datasets to characterize and disentangle the speaker and environment factors in speech. A neural network model is then trained to generate speech from the extracted speaker and environment embeddings. Objective and subjective evaluation results demonstrate that the proposed TTS system effectively disentangles speaker and environment factors and synthesizes speech audio that carries the designated speaker characteristics and environment attributes. Audio samples are available online for demonstration: https://daxintan-cuhk.github.io/Environment-Aware-TTS/
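
As an illustration of the conditioning idea described in the abstract, the following minimal Python sketch attaches a speaker embedding and an environment embedding to every frame of an encoded text sequence. The abstract does not say how the embeddings are injected into the synthesis network, so conditioning by concatenation, the function name `condition_text_encoding`, and all toy values are assumptions made for illustration, not the authors' implementation.

```python
from typing import List

Vector = List[float]


def condition_text_encoding(
    text_frames: List[Vector],
    speaker_emb: Vector,
    env_emb: Vector,
) -> List[Vector]:
    """Append the speaker and environment embeddings to every text frame.

    Concatenation is an assumed conditioning mechanism; the abstract only
    states that both embeddings serve as conditions for synthesis.
    """
    return [frame + speaker_emb + env_emb for frame in text_frames]


# Hypothetical toy inputs: 3 encoded text frames of dimension 2.
text_frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
speaker_a = [1.0, 0.0]   # stand-in for an extracted speaker embedding
env_clean = [0.0]        # stand-in for a "clean" environment embedding
env_reverb = [1.0]       # stand-in for a "reverberant" environment embedding

# Same text and speaker, two different target environments.
clean = condition_text_encoding(text_frames, speaker_a, env_clean)
reverb = condition_text_encoding(text_frames, speaker_a, env_reverb)
```

Because the two factors occupy separate slots in the conditioned frames, swapping `env_clean` for `env_reverb` changes only the environment part and leaves the text and speaker parts untouched. This is the kind of independent control that disentangled embeddings are meant to provide.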



Speech Synthesis


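Speech synthesis, also known as text-to-speech (TTS), converts arbitrary input text into natural, fluent speech output. It draws on artificial intelligence, psychology, acoustics, linguistics, digital signal processing, computer science, and other disciplines, and is a frontier technology in the field of information processing. As computing technology has advanced, speech synthesis has progressed from early formant synthesis to waveform concatenation synthesis and statistical parametric speech synthesis, and then to hybrid speech synthesis. The quality and naturalness of synthesized speech have improved markedly and can largely meet the needs of certain specific applications. Today, speech synthesis is widely used in information announcement systems in banks and hospitals, car navigation systems, and automated call centers, and has yielded substantial economic benefits. In addition, with the proliferation of smartphones, MP3 players, PDAs, and other devices closely tied to everyday life, its applications are extending into entertainment, language teaching, rehabilitation therapy, and other fields. Speech synthesis can be said to be influencing every aspect of people's lives.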
