Federated learning (FL) is an emerging training paradigm for utilizing decentralized training data. FL allows clients to update model parameters locally for several epochs and then share them with a server for aggregation into a global model. This paradigm of multiple local update steps before aggregation exposes unique vulnerabilities to adversarial attacks. Adversarial training is a popular and effective method for improving the robustness of networks against adversaries. In this work, we formulate a general form of federated adversarial learning (FAL), adapted from adversarial learning in the centralized setting. On the client side of FL training, FAL has an inner loop that generates adversarial samples for adversarial training and an outer loop that updates local model parameters. On the server side, FAL aggregates local model updates and broadcasts the aggregated model. We design a global robust training loss and formulate FAL training as a min-max optimization problem. Unlike convergence analysis in classical centralized training, which relies on the gradient direction, analyzing convergence in FAL is significantly harder for three reasons: 1) the complexity of min-max optimization, 2) the model not updating in the gradient direction due to multiple local updates on the client side before aggregation, and 3) inter-client heterogeneity. We address these challenges with appropriate gradient approximation and coupling techniques and present a convergence analysis in the over-parameterized regime. Our main result shows theoretically that, under our algorithm, the minimum loss converges to $\epsilon$-small with an appropriately chosen learning rate and number of communication rounds. Notably, our analysis holds for non-IID clients.
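The FAL loop described above can be sketched in a few lines. The following is a minimal illustration only, assuming a linear model with squared loss, an FGSM-style single-step inner maximization, and plain FedAvg aggregation; all function names, step sizes, and the two-client non-IID setup are illustrative choices, not details from the paper.

```python
import numpy as np

def adversarial_example(w, x, y, eps):
    """Inner loop: one ascent step on the loss w.r.t. the input (FGSM-style)."""
    grad_x = 2.0 * (w @ x - y) * w            # gradient of (w.x - y)^2 in x
    return x + eps * np.sign(grad_x)

def client_update(w_global, data, eps=0.1, lr=0.01, local_steps=5):
    """Outer loop: several local SGD epochs on adversarial samples."""
    w = w_global.copy()
    for _ in range(local_steps):
        for x, y in data:
            x_adv = adversarial_example(w, x, y, eps)  # inner maximization
            grad_w = 2.0 * (w @ x_adv - y) * x_adv     # outer minimization
            w -= lr * grad_w
    return w

def server_round(w_global, clients):
    """Server side: aggregate local updates (FedAvg) and broadcast the mean."""
    local_models = [client_update(w_global, data) for data in clients]
    return np.mean(local_models, axis=0)

# Two non-IID clients: different input distributions, same underlying labels.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
clients = []
for shift in (0.0, 3.0):
    xs = rng.normal(shift, 1.0, size=(20, 2))
    clients.append([(x, float(w_true @ x)) for x in xs])

w = np.zeros(2)
for _ in range(30):                           # communication rounds
    w = server_round(w, clients)
```

In this toy setting the aggregated model approaches the data-generating weights despite the heterogeneous clients, up to a small bias introduced by the adversarial perturbations, mirroring the convergence behavior the analysis establishes.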