Successful deep learning models often involve training neural network architectures that contain more parameters than the number of training samples. Such overparametrized models have been studied extensively in recent years, and the virtues of overparametrization have been established from both the statistical perspective, via the double-descent phenomenon, and the computational perspective, via the structural properties of the optimization landscape. Despite the remarkable success of deep learning architectures in the overparametrized regime, it is also well known that these models are highly vulnerable to small adversarial perturbations of their inputs. Even when adversarially trained, their performance on perturbed inputs (robust generalization) is considerably worse than their best attainable performance on benign inputs (standard generalization). It is thus imperative to understand how overparametrization fundamentally affects robustness. In this paper, we provide a precise characterization of the role of overparametrization in robustness by focusing on random features regression models (two-layer neural networks with random first-layer weights). We consider a regime in which the sample size, the input dimension, and the number of parameters grow in proportion to each other, and derive an asymptotically exact formula for the robust generalization error when the model is adversarially trained. Our theory reveals the nontrivial effect of overparametrization on robustness and indicates that, for adversarially trained random features models, high overparametrization can hurt robust generalization.
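To make the setting concrete, the following is a minimal, self-contained sketch of adversarially training a random features regression model against ℓ2-bounded input perturbations. It is an illustration under stated assumptions, not the exact procedure analyzed in the paper: the tanh activation, the linear teacher generating y, the one-step gradient attack used for the inner maximization, and all numerical values (n, d, N, eps, the learning rate) are hypothetical choices.

```python
import numpy as np

# Sketch only: random features model f(x) = a^T sigma(W x), with the
# first-layer weights W drawn at random and kept fixed; only the
# second-layer weights a are trained, here with adversarial training.
rng = np.random.default_rng(0)

n, d, N = 400, 200, 600      # sample size, input dimension, number of features
                             # (grown in proportion in the paper's asymptotic regime)
eps = 0.1                    # ell_2 perturbation budget (assumed value)
sigma = np.tanh
dsigma = lambda z: 1.0 - np.tanh(z) ** 2

W = rng.standard_normal((N, d)) / np.sqrt(d)     # random, untrained first layer
X = rng.standard_normal((n, d))
beta = rng.standard_normal(d) / np.sqrt(d)
y = X @ beta + 0.1 * rng.standard_normal(n)      # hypothetical linear teacher

a = np.zeros(N)              # trainable second-layer weights
lr = 0.01
for _ in range(200):
    # Inner maximization (stand-in for the exact worst case): a one-step
    # ell_2 attack that moves each input along its loss gradient.
    Z = X @ W.T                                   # (n, N) pre-activations
    resid = sigma(Z) @ a - y                      # residuals on clean inputs
    Gx = (resid[:, None] * (dsigma(Z) * a)) @ W   # d(loss)/dx, up to a constant
    norms = np.linalg.norm(Gx, axis=1, keepdims=True) + 1e-12
    X_adv = X + eps * Gx / norms

    # Outer minimization: gradient step on the squared loss at perturbed inputs.
    feats_adv = sigma(X_adv @ W.T)
    a -= lr * feats_adv.T @ (feats_adv @ a - y) / n
```

Comparing the resulting squared error on adversarially perturbed test points against the clean test error, as N/n is varied, is the kind of robust-versus-standard generalization comparison the paper characterizes exactly in the proportional limit.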