Deep learning optimization involves minimizing a high-dimensional loss function over the weight space, a task often perceived as difficult because of saddle points, local minima, ill-conditioning of the Hessian, and limited compute resources. In this paper, we provide a comprehensive review of 12 standard optimization methods successfully used in deep learning research, together with a theoretical assessment of the difficulties in numerical optimization drawn from the optimization literature.