Understanding what kinds of cooperative structures deep neural networks (DNNs) can represent remains a fundamental yet insufficiently understood problem. In this work, we treat interactions as the basic units of such structure and investigate a largely unexplored question: how do DNNs encode interactions under different levels of contextual complexity, and how do these microscopic interaction patterns shape macroscopic representation capacity? To quantify this complexity, we use multi-order interactions [57], where the order of an interaction reflects the amount of contextual information required to evaluate the joint utility of a variable pair. This formulation enables a stratified analysis of the cooperative patterns learned by DNNs, on which we build a comprehensive study of interaction structure. (i) We empirically discover a universal interaction bottleneck: across architectures and tasks, DNNs readily learn low-order and high-order interactions but consistently under-represent mid-order ones. (ii) We theoretically explain this bottleneck by proving that mid-order interactions incur the highest contextual variability, which yields large gradient variance and makes them intrinsically difficult to learn. (iii) We further show that the bottleneck can be modulated by introducing losses that steer models toward interactions of selected orders. Finally, we connect microscopic interaction structure with macroscopic representational behavior: low-order-emphasized models exhibit stronger generalization and robustness, whereas high-order-emphasized models demonstrate greater structural modeling and fitting capability. Together, these results uncover an inherent representational bias in modern DNNs and establish interaction order as a powerful lens for interpreting and guiding deep representations.
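For reference, a minimal sketch of the multi-order interaction is given below, assuming [57] follows the standard game-theoretic definition; the symbols $v$, $N$, $S$, and $I^{(m)}$ are notation introduced here for illustration rather than taken from the original text.

% Sketch of the m-th order interaction between variables i and j.
% v(S): model output when only the variables in S are present;
% N = {1, ..., n}: the full set of input variables;
% the order m = |S| measures how much context the pair (i, j) is evaluated in.
\[
I^{(m)}(i,j) \;=\; \mathbb{E}_{S \subseteq N \setminus \{i,j\},\; |S| = m}
\Bigl[\, v(S \cup \{i,j\}) - v(S \cup \{i\}) - v(S \cup \{j\}) + v(S) \,\Bigr].
\]

Under this definition, low orders average the pair's marginal benefit over small contexts, high orders over nearly complete contexts, and mid orders over the combinatorially largest family of contexts, which is the regime where the bottleneck in (i) and (ii) arises.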