Music Generation by Deep Learning - Challenges and Directions

From arXiv, 17 pages. arXiv admin note: substantial text overlap with arXiv:1709.01620. Accepted for publication in Special Issue on Deep learning for music and audio, Neural Computing & Applications, Springer Nature, 2018.

In addition to traditional tasks such as prediction, classification and translation, deep learning is receiving growing attention as an approach for music generation, as witnessed by recent research groups such as Magenta at Google and CTRL (Creator Technology Research Lab) at Spotify. The motivation is to use the capacity of deep learning architectures and training techniques to automatically learn musical styles from arbitrary musical corpora and then to generate samples from the estimated distribution. However, a direct application of deep learning to generate content rapidly reaches limits, as the generated content tends to mimic the training set without exhibiting true creativity. Moreover, deep learning architectures do not offer direct ways of controlling generation (e.g., imposing some tonality or other arbitrary constraints). Furthermore, deep learning architectures alone are autistic automata which generate music autonomously without human user interaction, far from the objective of interactively assisting musicians to compose and refine music. Issues such as control, structure, creativity and interactivity are the focus of our analysis. In this paper, we select some limitations of a direct application of deep learning to music generation, analyze why these issues are not addressed, and discuss possible approaches for addressing them. Various recent systems are cited as examples of promising directions.
