Building on deep learning-based acoustic echo cancellation (AEC) for the single-loudspeaker (single-channel), single-microphone setup, this paper investigates multi-channel AEC (MCAEC) and multi-microphone AEC (MMAEC). We train a deep neural network (DNN) to predict the near-end speech from the microphone signals, with the far-end signals used as additional information. We find that the deep learning approach avoids the non-uniqueness problem of traditional MCAEC algorithms. For the AEC setup with multiple microphones, rather than running a separate AEC for each microphone, a single DNN is trained to remove echo for all microphones. In addition, combining deep learning-based AEC with deep learning-based beamforming further improves system performance. Experimental results show the effectiveness of both bidirectional long short-term memory (BLSTM) and convolutional recurrent network (CRN) based methods for MCAEC and MMAEC. Furthermore, the deep learning-based methods can remove echo and noise simultaneously and work well in the presence of nonlinear distortions.
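As a rough illustration of how microphone signals and far-end signals might be presented jointly to a single network, the following sketch simulates a two-loudspeaker, two-microphone setup and stacks per-signal spectral features into one input matrix. Everything here is an assumption made for illustration and is not taken from the paper: the function names `frame_log_magnitude` and `build_dnn_input`, the log-magnitude features, the frame and hop sizes, the random echo paths, and the use of white noise in place of speech. No network is trained; the sketch only shows the shape of the input a BLSTM or CRN could consume.

```python
import numpy as np

def frame_log_magnitude(signal, frame_len=320, hop=160):
    """Split a 1-D signal into overlapping windowed frames and
    return their log-magnitude spectra, shape (n_frames, frame_len // 2 + 1)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectra = np.fft.rfft(frames, axis=1)
    return np.log(np.abs(spectra) + 1e-8)

def build_dnn_input(mic_signals, far_end_signals):
    """Concatenate the features of all microphone and far-end signals along
    the feature axis, so that one network sees every channel at once."""
    signals = list(mic_signals) + list(far_end_signals)
    return np.concatenate([frame_log_magnitude(s) for s in signals], axis=1)

rng = np.random.default_rng(0)
n_samples = 16000                                   # 1 s at an assumed 16 kHz rate
far_end = rng.standard_normal((2, n_samples))       # two loudspeaker channels
near_end = rng.standard_normal(n_samples)           # stand-in for near-end speech

# Hypothetical linear echo paths, one per (microphone, loudspeaker) pair.
echo_paths = 0.1 * rng.standard_normal((2, 2, 64))

# Each microphone picks up the near-end signal plus the echoes of both
# loudspeakers. For simplicity the near-end signal is identical at both
# microphones, which would not hold in a real room.
mics = np.stack([
    near_end + sum(np.convolve(far_end[j], echo_paths[m, j])[:n_samples]
                   for j in range(2))
    for m in range(2)
])

features = build_dnn_input(mics, far_end)
print(features.shape)
```

With these settings each signal contributes 161 frequency bins per frame, so the four stacked signals give a feature matrix with 644 columns. A training target (for example the near-end features or a mask derived from them) would be built from `near_end` in the same framing.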