Convolutive Transfer Function Invariant SDR training criteria for Multi-Channel Reverberant Speech Separation - Zhuanzhi (专知) Papers

November 30, 2020

Christoph Boeddeker, Wangyou Zhang, Tomohiro Nakatani, Keisuke Kinoshita, Tsubasa Ochiai, Marc Delcroix, Naoyuki Kamo, Yanmin Qian, Shinji Watanabe, Reinhold Haeb-Umbach

From arXiv; submitted to ICASSP 2021

Time-domain training criteria have proven to be very effective for the separation of single-channel non-reverberant speech mixtures. Likewise, mask-based beamforming has shown impressive performance in multi-channel reverberant speech enhancement and source separation. Here, we propose to combine neural network supported multi-channel source separation with a time-domain training objective function. For the objective, we propose to use a convolutive transfer function invariant Signal-to-Distortion Ratio (CI-SDR) based loss. While this is a well-known evaluation metric (BSS Eval), it has not been used as a training objective before. To show the effectiveness, we demonstrate the performance on LibriSpeech-based reverberant mixtures. On this task, the proposed system approaches the error rate obtained on single-source non-reverberant input, i.e., LibriSpeech test_clean, with a difference of only 1.2 percentage points, thus outperforming by a large margin both a system based on conventional permutation invariant training and alternative objectives such as the Scale-Invariant Signal-to-Distortion Ratio (SI-SDR).
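
To make the difference between the two objectives named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. SI-SDR allows only a single gain between the reference and the estimate, whereas a convolutive-transfer-function-invariant SDR allows a short FIR filter, so a filtered (for example, reverberated) copy of the reference is not counted as distortion. The function names `si_sdr` and `ci_sdr` and the `filter_length` parameter are hypothetical, the default of 512 taps is an assumption based on common BSS Eval toolkits, and the dense least-squares solve is chosen for readability rather than efficiency. A training loss would be the negative of the returned value, computed in an autodiff framework.

```python
import numpy as np


def si_sdr(reference, estimate):
    """Scale-invariant SDR in dB: the target is the reference times one gain."""
    alpha = np.dot(estimate, reference) / np.dot(reference, reference)
    target = alpha * reference
    distortion = estimate - target
    return 10 * np.log10(np.sum(target ** 2) / np.sum(distortion ** 2))


def ci_sdr(reference, estimate, filter_length=512):
    """Filter-invariant SDR in dB (illustrative sketch).

    The target is the least-squares projection of the estimate onto
    `filter_length` delayed copies of the reference, i.e. the reference
    convolved with the best-fitting FIR filter.
    """
    n = len(reference)
    # Column k holds the reference delayed by k samples.
    shifted = np.zeros((n, filter_length))
    for k in range(filter_length):
        shifted[k:, k] = reference[: n - k]
    # Best FIR filter in the least-squares sense.
    h, *_ = np.linalg.lstsq(shifted, estimate, rcond=None)
    target = shifted @ h
    distortion = estimate - target
    return 10 * np.log10(np.sum(target ** 2) / np.sum(distortion ** 2))


# Toy demo: the estimate is a filtered reference plus a little noise.
demo_rng = np.random.default_rng(0)
demo_ref = demo_rng.standard_normal(4000)
demo_est = np.convolve(demo_ref, [1.0, 0.5, 0.25])[:4000]
demo_est = demo_est + 0.01 * demo_rng.standard_normal(4000)

print("SI-SDR [dB]:", si_sdr(demo_ref, demo_est))
print("CI-SDR [dB]:", ci_sdr(demo_ref, demo_est, filter_length=8))
```

In this toy setup the SI-SDR treats the filter tail as distortion, while the filter-invariant variant absorbs it into the target and mostly measures the added noise.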
