Data privacy has become an increasingly important issue in machine learning. Many approaches have been developed to address it, e.g., cryptography (Homomorphic Encryption, Differential Privacy, etc.) and collaborative training (Secure Multi-Party Computation, Distributed Learning and Federated Learning). These techniques focus on data encryption or secure local computation, and transfer intermediate information to a third party to compute the final result. Gradient exchange is commonly considered a secure way of collaboratively training a robust model in deep learning. However, recent research has demonstrated that sensitive information can be recovered from the shared gradient. Generative Adversarial Networks (GANs), in particular, have been shown to be effective in recovering such information. However, GAN-based techniques require additional information, such as class labels, which are generally unavailable in privacy-preserving learning. In this paper, we show that, in a Federated Learning (FL) system, image-based private data can be easily recovered in full from the shared gradient alone via our proposed Generative Regression Neural Network (GRNN). We formulate the attack as a regression problem and optimise two branches of the generative model by minimising the distance between gradients. We evaluate our method on several image classification tasks. The results show that our proposed GRNN outperforms state-of-the-art methods with better stability, stronger robustness, and higher accuracy. It also places no convergence requirement on the global FL model. Moreover, we demonstrate information leakage using face re-identification. Some defense strategies are also discussed in this work.
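The idea of treating gradient leakage as a regression problem can be illustrated with a deliberately small sketch. It is not the GRNN architecture: instead of training a two-branch generator, it directly optimises a dummy input and a dummy label so that the gradient they induce matches the shared one. The linear model with squared loss, the concrete numbers, and all function names (`shared_gradient`, `matching_loss_and_grad`, `recover`) are assumptions made purely for illustration.

```python
# Toy gradient-matching attack in pure Python (illustrative assumption, not GRNN).
# Model: f(x) = w.x + b, loss = 0.5 * (f(x) - y)^2.

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def shared_gradient(w, b, x, y):
    """Gradient of the loss w.r.t. (w, b): what a client would share."""
    r = dot(w, x) + b - y                      # residual
    return [r * xi for xi in x], r

def matching_loss_and_grad(w, b, x_hat, y_hat, g_w, g_b):
    """Squared distance between dummy and shared gradients, plus its
    analytic derivative w.r.t. the dummy input and dummy label."""
    r = dot(w, x_hat) + b - y_hat
    e_w = [r * xi - gi for xi, gi in zip(x_hat, g_w)]
    e_b = r - g_b
    loss = dot(e_w, e_w) + e_b * e_b
    s = dot(e_w, x_hat) + e_b                  # d(loss)/d(r) divided by 2
    grad_x = [2.0 * (wj * s + r * ej) for wj, ej in zip(w, e_w)]
    grad_y = -2.0 * s
    return loss, grad_x, grad_y

def recover(w, b, g_w, g_b, x_hat, y_hat, steps=5000):
    """Gradient descent with Armijo backtracking on the matching loss."""
    x_hat = list(x_hat)
    loss, gx, gy = matching_loss_and_grad(w, b, x_hat, y_hat, g_w, g_b)
    for _ in range(steps):
        sq = dot(gx, gx) + gy * gy
        if sq < 1e-30:
            break
        lr, accepted = 1.0, False
        while lr > 1e-12:
            x_new = [xi - lr * gi for xi, gi in zip(x_hat, gx)]
            y_new = y_hat - lr * gy
            new_loss, ngx, ngy = matching_loss_and_grad(
                w, b, x_new, y_new, g_w, g_b)
            if new_loss <= loss - 1e-4 * lr * sq:   # sufficient decrease
                x_hat, y_hat = x_new, y_new
                loss, gx, gy = new_loss, ngx, ngy
                accepted = True
                break
            lr *= 0.5
        if not accepted:
            break
    return x_hat, y_hat, loss

# Hypothetical global model parameters and one private training sample.
w, b = [0.5, -0.3, 0.8], 0.1
true_x, true_y = [0.2, 0.5, 0.1], 1.0

# The attacker sees only (w, b) and the shared gradient.
g_w, g_b = shared_gradient(w, b, true_x, true_y)
init_x, init_y = [0.0, 0.0, 0.0], 0.5
initial_loss, _, _ = matching_loss_and_grad(w, b, init_x, init_y, g_w, g_b)
rec_x, rec_y, final_loss = recover(w, b, g_w, g_b, init_x, init_y)

print("recovered input:", rec_x)
print("recovered label:", rec_y)
print("gradient distance:", initial_loss, "->", final_loss)
```

In this toy setting the private input and label are uniquely determined by the shared gradient, so the optimisation needs no label information. For deep image classifiers the same objective is far harder to optimise, which is the setting the proposed generator-based formulation targets.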