Federated learning (FL) aims to perform privacy-preserving machine learning on distributed data held by multiple data owners. To this end, FL requires the data owners to perform training locally and share the gradient updates (instead of the private inputs) with a central server, which securely aggregates them across the data owners. Although aggregation by itself does not provably offer privacy protection, prior work showed that it may suffice if the batch size is sufficiently large. In this paper, we propose the Cocktail Party Attack (CPA) that, contrary to prior belief, is able to recover the private inputs from gradients aggregated over a very large batch. CPA leverages the crucial insight that the aggregate gradient of a fully connected layer is a linear combination of its inputs, which leads us to frame gradient inversion as a blind source separation (BSS) problem (informally called the cocktail party problem). We adapt independent component analysis (ICA), a classic solution to the BSS problem, to recover private inputs for fully connected and convolutional networks, and show that CPA significantly outperforms prior gradient inversion attacks, scales to ImageNet-sized inputs, and works on large batch sizes of up to 1024.
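The key insight above can be verified directly: for a fully connected layer, the batch-aggregated weight gradient is a linear mixture of the private input examples, with the (unknown) per-example upstream gradients as mixing coefficients. A minimal sketch with a hypothetical toy layer (illustrative dimensions and random data, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)
batch, d_in, d_out = 8, 16, 4

X = rng.standard_normal((batch, d_in))       # private inputs, one per row
delta = rng.standard_normal((batch, d_out))  # upstream gradients dL/dy per example

# For y = x @ W, the weight gradient aggregated over the batch is
#   G = sum_i x_i^T delta_i  =  X^T @ delta,  shape (d_in, d_out).
G = X.T @ delta

# Each column of G is a linear combination of the input rows:
#   G[:, j] = sum_i delta[i, j] * X[i, :]
# so G^T = delta^T @ X is a matrix of "mixed signals" in the BSS sense,
# and recovering X from G (without knowing delta) is a source-separation
# problem that ICA-style methods can attack.
assert np.allclose(G.T, delta.T @ X)
```

This is why inversion of the aggregate gradient reduces to blind source separation: the server sees only G, while both the sources X and the mixing matrix delta are unknown.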