As collaborative learning allows joint training of a model using multiple sources of data, security has been a central concern. Malicious users can upload poisoned data to prevent the model's convergence or to inject hidden backdoors. The so-called backdoor attacks are especially difficult to detect, since the model behaves normally on standard test data but gives wrong outputs when triggered by certain backdoor keys. Although Byzantine-tolerant training algorithms provide convergence guarantees, provable defense against backdoor attacks remains largely unsolved. Methods based on randomized smoothing can only correct a small number of corrupted pixels or labels; methods based on subset aggregation suffer a severe drop in classification accuracy due to low data utilization. We propose a novel framework that generalizes existing subset aggregation methods. The framework shows that the subset selection process, a deciding factor for subset aggregation methods, can be viewed as a code design problem. We derive a theoretical bound on the data utilization ratio and provide an optimal code construction. Experiments on non-IID versions of MNIST and CIFAR-10 show that our method with optimal codes significantly outperforms baselines using a non-overlapping partition and random selection. Additionally, integration with existing results from coding theory shows that special codes can track the locations of the attackers. Such capability provides new countermeasures against backdoor attacks.