This work studies the robustness certification problem for neural network models, which aims to find certified adversary-free regions around data points that are as large as possible. In contrast to existing approaches, which seek regions bounded uniformly along all input features, we consider non-uniform bounds and use them to study the decision boundary of neural network models. We formulate our target as an optimization problem with nonlinear constraints. We then propose a framework, applicable to general feedforward neural networks, that bounds the output logits so that the relaxed problem can be solved by the augmented Lagrangian method. Our experiments show that non-uniform bounds have larger volumes than uniform ones. Compared with normally trained models, robustly trained models have even larger non-uniform bounds and better interpretability. Furthermore, the geometric similarity of the non-uniform bounds gives a quantitative, data-agnostic measure of the robustness of individual input features.
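As a rough sketch of the certification target described above (the notation here is illustrative and not taken from the paper: $x_0$ denotes a data point classified as $c$, $\varepsilon_i$ the per-feature radii, $f_j$ the output logits, and the log-volume objective is one natural choice), the non-uniform bound search can be written as

\[
\begin{aligned}
\max_{\varepsilon \in \mathbb{R}^{d}_{>0}} \quad & \sum_{i=1}^{d} \log \varepsilon_i \\
\text{s.t.} \quad & f_c(x) - f_j(x) > 0 \quad \forall j \neq c,\ \ \forall x:\ |x_i - (x_0)_i| \le \varepsilon_i,\ i = 1, \dots, d.
\end{aligned}
\]

Replacing the inner worst case with a lower bound on the logit differences $f_c - f_j$ over the box turns the semi-infinite constraint into finitely many nonlinear constraints, giving a relaxed problem of the kind the augmented Lagrangian method can handle.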