ERANNs: Efficient Residual Audio Neural Networks for Audio Pattern Recognition

Audio pattern recognition (APR) is an important research topic and can be applied to several fields related to our lives. Therefore, accurate and efficient APR systems need to be developed as they are useful in real applications. In this paper, we propose a new convolutional neural network (CNN) architecture and a method for improving the inference speed of CNN-based systems for APR tasks. Moreover, using the proposed method, we can improve the performance of our systems, as confirmed in experiments conducted on four audio datasets. In addition, we investigate the impact of data augmentation techniques and transfer learning on the performance of our systems. Our best system achieves a mean average precision (mAP) of 0.450 on the AudioSet dataset. Although this value is less than that of the state-of-the-art system, the proposed system is 7.1x faster and 9.7x smaller. On the ESC-50, UrbanSound8K, and RAVDESS datasets, we obtain state-of-the-art results with accuracies of 0.961, 0.908, and 0.748, respectively. Our system for the ESC-50 dataset is 1.7x faster and 2.3x smaller than the previous best system. For the RAVDESS dataset, our system is 3.3x smaller than the previous best system. We name our systems "Efficient Residual Audio Neural Networks".

