Designing neural network architectures is a task that lies somewhere between science and art. For a given task, some architectures are eventually preferred over others, based on a mix of intuition, experience, experimentation and luck. For many tasks, the loss function has the final word, while for others a further perceptual evaluation is needed to assess and compare performance across models. In this paper, we introduce the concept of capacity allocation analysis, with the aim of shedding some light on what network architectures focus their modelling capacity on when used on a given task. We focus in particular on spatial capacity allocation, which analyzes a posteriori the effective number of parameters that a given model has allocated to modelling dependencies on a given point or region of the input space, in linear settings. We use this framework to perform a quantitative comparison of several classical architectures on various synthetic tasks. Finally, we consider how capacity allocation might translate to non-linear settings.