Deep learning frameworks leverage GPUs to efficiently perform massively parallel computations over batches of many training examples. However, for certain tasks, one may be interested in performing per-example computations, for instance using per-example gradients to evaluate a quantity of interest unique to each example. One notable application comes from the field of differential privacy, where per-example gradients must be norm-bounded in order to limit the impact of each example on the aggregated batch gradient. In this work, we discuss how per-example gradients can be efficiently computed in convolutional neural networks (CNNs). We compare existing strategies by performing a few steps of differentially-private training on CNNs of varying sizes. We also introduce a new strategy for per-example gradient computation, which is shown to be advantageous depending on the model architecture and how the model is trained. This is a first step towards making differentially-private training of CNNs practical.
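As a minimal sketch of the idea described above (not the paper's implementation), per-example gradients can be obtained by vectorizing a single-example gradient function over the batch, then clipping each example's gradient norm to bound its influence on the aggregate. The example below uses JAX's `vmap` over `grad` with a toy linear model standing in for a CNN; the function names and the `clip_norm` value are illustrative assumptions.

```python
import jax
import jax.numpy as jnp


def loss(params, x, y):
    # Squared-error loss of a toy linear model; stands in for a CNN's
    # per-example loss in this sketch.
    pred = jnp.dot(x, params)
    return (pred - y) ** 2


def clipped_per_example_grads(params, xs, ys, clip_norm=1.0):
    # vmap over grad yields one gradient per training example,
    # shape (batch, param_dim), instead of a single batch gradient.
    per_ex = jax.vmap(jax.grad(loss), in_axes=(None, 0, 0))(params, xs, ys)
    # Rescale each example's gradient so its L2 norm is at most clip_norm,
    # as required for differentially-private training.
    norms = jnp.linalg.norm(per_ex, axis=1, keepdims=True)
    scale = jnp.minimum(1.0, clip_norm / jnp.maximum(norms, 1e-12))
    return per_ex * scale


params = jnp.array([1.0, -2.0])
xs = jnp.array([[1.0, 0.0], [0.0, 3.0]])
ys = jnp.array([0.0, 1.0])
grads = clipped_per_example_grads(params, xs, ys)
```

In a DP-SGD-style pipeline, the clipped per-example gradients would then be summed and perturbed with noise before the parameter update; the sketch stops at the clipping step.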