We propose the Segmented Full-Song Model (SFS) for symbolic full-song generation. The model accepts a user-provided song structure and an optional short seed segment that anchors the main idea around which the song is developed. By factorizing a song into segments and generating each segment through selective attention to related segments, the model achieves higher quality and efficiency than prior work. To demonstrate its suitability for human-AI interaction, we further wrap SFS in a web application that enables users to iteratively co-create music on a piano roll, with customizable structures and flexible ordering.