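A deep neural network (DNN) is a deep learning architecture: a neural network with at least one hidden layer. Like shallow neural networks, deep neural networks can model complex nonlinear systems, but the additional layers provide higher levels of abstraction and thereby increase the model's capacity.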

VIP content

Title:

Confidence-Aware Learning for Deep Neural Networks

Summary:

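Although deep neural networks perform well on a wide range of tasks, their tendency to make overconfident predictions has limited their practical use in many safety-critical applications. Many recent works have been proposed to mitigate this issue, but most of them require additional computational cost in the training and/or inference phases, or customized architectures that output confidence estimates separately. In this paper, we propose a method for training deep neural networks with a novel loss function, named Correctness Ranking Loss, which explicitly regularizes class probabilities so that they become better confidence estimates in terms of ordinal ranking according to confidence. The proposed method is easy to implement and can be applied to existing architectures without any modification. It also has almost the same training cost as conventional deep classifiers and outputs reliable predictions with a single inference pass. Extensive experimental results on classification benchmark datasets indicate that the proposed method helps networks produce well-ranked confidence estimates. We also demonstrate that it is effective for tasks closely related to confidence estimation: out-of-distribution detection and active learning.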
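The summary above does not give the formula for the Correctness Ranking Loss, so the following is only a minimal sketch of what a pairwise ranking penalty on confidence could look like. It assumes that confidence is the maximum softmax probability and that each sample carries a hypothetical `correctness` score in [0, 1] (for example, the fraction of past epochs in which it was classified correctly). The function name `pairwise_confidence_ranking_loss` and the hinge form are illustrative assumptions, not the paper's definition.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def pairwise_confidence_ranking_loss(logits, correctness):
    """Hinge-style penalty on pairs whose confidence order disagrees
    with their correctness order (assumed form, for illustration only).

    logits:      array of shape (n, k), raw class scores
    correctness: array of shape (n,), hypothetical per-sample scores in [0, 1]
    """
    conf = softmax(logits).max(axis=1)      # confidence = max class probability
    conf_j = np.roll(conf, -1)              # pair sample i with sample i+1 (cyclic)
    corr_j = np.roll(correctness, -1)
    sign = np.sign(correctness - corr_j)    # which sample of the pair should rank higher
    margin = np.abs(correctness - corr_j)   # required confidence gap
    penalty = np.maximum(0.0, -sign * (conf - conf_j) + margin)
    return float(penalty.mean())


# Sample 0 is confident, sample 1 is not.
logits = np.array([[10.0, 0.0], [0.1, 0.0]])
agree = pairwise_confidence_ranking_loss(logits, np.array([1.0, 0.6]))
disagree = pairwise_confidence_ranking_loss(logits, np.array([0.6, 1.0]))
print(agree, disagree)
```

In this sketch the penalty is zero when the more frequently correct sample is also the more confident one by at least the correctness gap, and grows when the order is reversed. In training, a term like this would be added to the usual cross-entropy loss with a weighting factor, which is again an assumption here.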


Latest papers

We comprehensively reveal the learning dynamics of deep neural networks (DNN) with batch normalization (BN) and weight decay (WD), which we call Spherical Motion Dynamics (SMD). Our theorem on SMD is based on the scale-invariant property of weights caused by BN and the regularization effect of WD. SMD shows that the optimization trajectory of the weights resembles a spherical motion, and a new indicator, the angular update, is proposed to measure the update efficiency of a DNN with BN and WD. We rigorously prove that the angular update is determined only by pre-defined hyper-parameters (i.e., learning rate, WD parameter and momentum coefficient), and provide their quantitative relationship. Most importantly, the quantitative result of SMD can perfectly match the empirical observations in complex, large-scale computer vision tasks such as ImageNet and COCO with standard training schemes. SMD also yields reasonable interpretations of some phenomena related to BN from an entirely new perspective, including the avoidance of vanishing and exploding gradients, no risk of being trapped in sharp minima, and the sudden drop in loss when the learning rate is reduced. Further, to show the practical significance of SMD, we discuss the connection between SMD and a commonly used learning rate tuning scheme, the Linear Scaling Principle.
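The abstract does not define the angular update precisely. As a rough illustration, the sketch below assumes it is the angle between a weight vector before and after one optimizer step, and uses a toy scale-invariant loss (a stand-in for the effect of BN, not the paper's setup) to show why the gradient is orthogonal to the weights in that case. The names `angular_update` and `scale_invariant_loss_grad`, and the values of `lr` and `wd`, are hypothetical.

```python
import numpy as np


def angular_update(w_before, w_after):
    """Angle in radians between the weights before and after one step
    (assumed definition, for illustration)."""
    cos = np.dot(w_before, w_after) / (
        np.linalg.norm(w_before) * np.linalg.norm(w_after)
    )
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))


def scale_invariant_loss_grad(w, target):
    """Gradient of the toy loss L(w) = -<w / ||w||, target>.

    L depends only on the direction of w, so L(c * w) = L(w) for c > 0,
    which forces the gradient to be orthogonal to w.
    """
    norm = np.linalg.norm(w)
    u = w / norm
    return -(target - np.dot(u, target) * u) / norm


w = np.array([3.0, 4.0])
target = np.array([0.0, 1.0])
lr, wd = 0.1, 1e-3                 # hypothetical learning rate and weight decay

g = scale_invariant_loss_grad(w, target)
w_next = w - lr * (g + wd * w)     # one plain SGD step with weight decay

print("w . grad      =", np.dot(w, g))
print("angular update =", angular_update(w, w_next))
```

Because the gradient has no component along `w`, only the weight decay term changes the norm of the weights directly, while the gradient term rotates them; this is the geometric picture behind describing the trajectory as a spherical motion.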
