Projected Gradient Descent (PGD) is a strong and widely used first-order adversarial attack, yet adversarial training with PGD scales poorly: every training sample undergoes the same iterative inner-loop optimization even though samples contribute unequally to robustness. Motivated by this inefficiency, we propose \emph{Selective Adversarial Training}, which perturbs only a subset of critical samples in each minibatch. Specifically, we introduce two principled selection criteria: (1) margin-based sampling, which prioritizes samples near the decision boundary, and (2) gradient-matching sampling, which selects samples whose gradients align with the dominant batch optimization direction. Adversarial examples are generated only for the selected subset, while the remaining samples are trained on clean inputs under a mixed objective. Experiments on MNIST and CIFAR-10 show that the proposed methods match, and in some cases exceed, the robustness of full PGD adversarial training while reducing adversarial computation by up to $50\%$, demonstrating that informed sample selection suffices for scalable adversarial robustness.
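The selection-then-perturb pipeline described above can be sketched in a few lines. The following is a minimal NumPy illustration on a linear softmax classifier, not the paper's implementation: all function names and hyperparameters (`eps`, `alpha`, `steps`, the subset size `k`) are assumptions chosen for the sketch, and the analytic input gradient only holds for this toy linear model.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def select_by_margin(logits, k):
    # Margin = top-1 logit minus top-2 logit; a small margin means the
    # sample sits near the decision boundary, so it is prioritized.
    part = np.sort(logits, axis=1)
    margin = part[:, -1] - part[:, -2]
    return np.argsort(margin)[:k]

def select_by_grad_match(per_sample_grads, k):
    # Keep the k samples whose gradients are most aligned (by cosine
    # similarity) with the mean batch gradient direction.
    mean_g = per_sample_grads.mean(axis=0)
    sims = per_sample_grads @ mean_g / (
        np.linalg.norm(per_sample_grads, axis=1) * np.linalg.norm(mean_g) + 1e-12)
    return np.argsort(sims)[-k:]

def pgd_linear(x, y, W, eps=0.3, alpha=0.1, steps=5):
    # PGD on a linear softmax classifier: the cross-entropy gradient
    # w.r.t. the input is (softmax(xW) - onehot(y)) @ W.T, so no
    # autodiff is needed for this toy model.
    x_adv = x.copy()
    for _ in range(steps):
        p = softmax(x_adv @ W)
        p[np.arange(len(y)), y] -= 1.0        # dL/dlogits
        g = p @ W.T                           # dL/dx
        x_adv = x_adv + alpha * np.sign(g)    # ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project to eps-ball
    return x_adv
```

In a training loop, one would run `select_by_margin` (or `select_by_grad_match`) on the current batch, call `pgd_linear` only on the selected indices, and average the adversarial loss on that subset with the clean loss on the rest, which is where the computational saving comes from.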