Semi-supervised Federated Learning (SSFL) has recently drawn much attention because of a practical consideration: clients may hold only unlabeled data. In practice, SSFL systems implement semi-supervised training by assigning a "guessed" label to unlabeled data that lies near labeled data, thereby converting the unsupervised problem into a fully supervised one. However, the inherent properties of such semi-supervised training techniques create a new attack surface. In this paper, we discover and reveal a simple yet powerful poisoning attack against SSFL. Our attack exploits this natural characteristic of semi-supervised learning to poison the model through unlabeled data alone. Specifically, the adversary only needs to insert a small number of maliciously crafted unlabeled samples (e.g., only 0.1\% of the dataset) to degrade model performance and induce misclassification. Extensive case studies show that our attack is effective across different datasets and common semi-supervised learning methods. To mitigate the attack, we propose a defense, namely a minimax optimization-based client selection strategy, which enables the server to select clients that hold correct label information and provide high-quality updates. Our defense further employs a quality-based aggregation rule to strengthen the contributions of the selected updates. Evaluations under different attack conditions show that the proposed defense can effectively alleviate such unlabeled poisoning attacks. Our study unveils the vulnerability of SSFL to unlabeled poisoning attacks and provides the community with potential defense methods.
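To make the attack surface concrete, the following minimal sketch (our own illustration, not the paper's implementation) shows the label-guessing mechanism the abstract describes: an unlabeled sample is assigned the label of nearby labeled data (here, simple 1-nearest-neighbor pseudo-labeling). A maliciously crafted unlabeled sample whose features are pushed toward another class's region then inherits the wrong pseudo-label, so subsequent supervised training on it poisons the model. All names and data here are hypothetical.

```python
# Illustration of pseudo-label "guessing" and how a crafted unlabeled
# sample can receive an attacker-chosen label. Assumed setup, not the
# paper's actual method.

def guess_label(x, labeled):
    """Assign the label of the nearest labeled sample (1-NN pseudo-labeling)."""
    nearest = min(
        labeled,
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(x, pair[0])),
    )
    return nearest[1]

# Labeled data: class 0 clusters near (0, 0), class 1 near (10, 10).
labeled = [((0.0, 0.0), 0), ((1.0, 0.5), 0),
           ((10.0, 10.0), 1), ((9.5, 10.5), 1)]

# A benign unlabeled sample from class 0's region gets the correct guess.
benign = (0.5, 0.2)
print(guess_label(benign, labeled))    # 0

# A maliciously crafted unlabeled sample: its features are perturbed toward
# class 1's region, so it is pseudo-labeled 1 and, once trained on, drags
# the model's decision boundary without the adversary ever touching labels.
poisoned = (9.0, 9.8)
print(guess_label(poisoned, labeled))  # 1
```

Because the adversary controls only unlabeled inputs, such samples bypass defenses that audit labels, which is why even a 0.1\% injection rate can be effective.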