With a direct analysis of neural networks, this paper presents a mathematically tight generalization theory to partially address an open problem regarding the generalization of deep learning. Unlike previous bound-based theories, our main theory is quantitatively as tight as possible for each dataset individually, while yielding qualitative insights competitive with prior work. Our results give insight into why and how deep learning can generalize well despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, answering an open question in the literature. We also discuss the limitations of our results and propose additional open problems.