Adversarial examples have been found for a wide variety of deep and shallow learning models, and have variously been suggested to be fixable model-specific bugs, inherent dataset features, or both. We present theoretical and empirical results showing that adversarial examples are approximate discontinuities, arising whenever a model specifies an approximately bijective map $f: \mathbb{R}^n \to \mathbb{R}^m,\ n \neq m$, over its inputs; the discontinuity follows from the topological invariance of dimension.
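For readers unfamiliar with the topological fact the abstract invokes, the following is a minimal sketch of the argument, assuming only the standard statement of Brouwer's invariance-of-dimension theorem; the theorem and corollary below are illustrative framing, not taken from the paper itself.

\begin{theorem}[Invariance of dimension, Brouwer]
If $n \neq m$, then $\mathbb{R}^n$ and $\mathbb{R}^m$ are not homeomorphic; that is, no continuous bijection between them has a continuous inverse.
\end{theorem}

\begin{corollary}[Sketch]
Let $f \colon \mathbb{R}^n \to \mathbb{R}^m$ with $n \neq m$ be approximately bijective. If $f$ were also continuous with a continuous (approximate) inverse, it would induce a homeomorphism between Euclidean spaces of different dimension, contradicting the theorem. Hence $f$ must fail to be continuous, at least approximately, somewhere; pairs of nearby inputs straddling such an approximate discontinuity receive very different outputs, which is the behavior exhibited by adversarial examples.
\end{corollary}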