Efficient and Robust Mixed-Integer Optimization Methods for Training Binarized Deep Neural Networks

Compared to classical deep neural networks, their binarized versions can be useful for applications on resource-limited devices because they reduce memory consumption and computational demands. In this work, we study deep neural networks with binary activation functions and continuous or integer weights (BDNN). We show that the BDNN can be reformulated as a mixed-integer linear program with bounded weight space, which can be solved to global optimality by classical mixed-integer programming solvers. Additionally, a local search heuristic is presented to calculate locally optimal networks. Furthermore, to improve efficiency, we present an iterative data-splitting heuristic which splits the training set into smaller subsets by using the k-means method. Afterwards, all data points in a given subset are forced to follow the same activation pattern, which leads to a much smaller number of integer variables in the mixed-integer programming formulation and therefore to computational improvements. Finally, for the first time, a robust model is presented which enforces robustness of the BDNN during training. All methods are tested on random and real datasets, and our results indicate that all models can often compete with or even outperform classical DNNs on small network architectures, confirming the viability for applications having restricted memory or computing power.
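
As a rough illustration of the data-splitting idea described above, the following Python sketch shows a forward pass with binary activations, a plain k-means grouping of the training points, and a count of activation variables before and after grouping. It is not the authors' implementation. The 0/1 threshold convention, the helper names (`binary_forward`, `kmeans`, `count_activation_variables`), the naive first-k initialization, and the assumption of one binary variable per (group, hidden neuron) pair are all hypothetical choices made for illustration only.

```python
def binary_forward(x, weights):
    """Forward pass with 0/1 step activations on the hidden layers.

    Assumed convention: a hidden neuron outputs 1 if its pre-activation
    is >= 0 and 0 otherwise. The last layer is left linear.
    Returns the output and the activation pattern of the hidden layers.
    """
    h = list(x)
    pattern = []
    for W in weights[:-1]:
        h = [1 if sum(w * v for w, v in zip(row, h)) >= 0 else 0 for row in W]
        pattern.append(tuple(h))
    out = [sum(w * v for w, v in zip(row, h)) for row in weights[-1]]
    return out, tuple(pattern)


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; returns one cluster index per point.

    Initialization with the first k points is a simplification
    (an assumption of this sketch, not a recommended choice).
    """
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each non-empty center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def count_activation_variables(n_groups, hidden_sizes):
    """Assumed count: one binary variable per (group, hidden neuron) pair."""
    return n_groups * sum(hidden_sizes)


# Two well-separated groups of training points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = kmeans(points, k=2)
hidden_sizes = [3, 2]

per_point = count_activation_variables(len(points), hidden_sizes)
per_group = count_activation_variables(len(set(labels)), hidden_sizes)
print(labels, per_point, per_group)
```

Under these assumptions, forcing every point in a cluster to share one activation pattern makes the number of binary activation variables scale with the number of clusters rather than with the number of training points, which is the source of the computational improvement claimed in the abstract.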
