A widely recognized difficulty in federated learning arises from the statistical heterogeneity among clients: local datasets often come from different but not entirely unrelated distributions, and personalization is therefore necessary to achieve optimal results from each client's perspective. In this paper, we show how the excess risks of personalized federated learning with a smooth, strongly convex loss depend on data heterogeneity from a minimax point of view. Our analysis reveals a surprising theorem of the alternative for personalized federated learning: there exists a threshold such that (a) if a certain measure of data heterogeneity is below this threshold, the FedAvg algorithm [McMahan et al., 2017] is minimax optimal; (b) if the measure of heterogeneity is above this threshold, then pure local training (i.e., each client solves an empirical risk minimization problem on its local dataset without any communication) is minimax optimal. As an implication, our results show that the presumably difficult (infinite-dimensional) problem of adapting to client-wise heterogeneity reduces to a simple binary decision problem: choosing between the two baseline algorithms. Our analysis relies on a new notion of algorithmic stability that takes into account the nature of federated learning.
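To make the dichotomy concrete, the following Python sketch illustrates the binary decision in a toy ridge-regression setting. It is purely illustrative, not the paper's procedure: the functions `local_erm`, `heterogeneity_proxy`, and `choose_baseline`, as well as the `threshold` value, are hypothetical stand-ins, and the paper's heterogeneity measure and threshold are defined at the population level for smooth, strongly convex losses rather than computed from fitted models as done here.

```python
import numpy as np

def local_erm(X, y, lam=0.1):
    """Ridge regression as a stand-in for a client's smooth, strongly
    convex empirical risk minimization on its local dataset."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def heterogeneity_proxy(local_models):
    """Illustrative empirical proxy for data heterogeneity: the largest
    distance from any client's local solution to the clients' average.
    (The paper's actual measure is defined on the underlying
    distributions, not on fitted models.)"""
    center = np.mean(local_models, axis=0)
    return max(np.linalg.norm(w - center) for w in local_models)

def choose_baseline(clients, threshold, lam=0.1):
    """The binary decision suggested by the theorem of the alternative:
    federate (FedAvg) when heterogeneity is low, train purely locally
    when it is high. `threshold` is a hypothetical tuning constant."""
    local_models = [local_erm(X, y, lam) for X, y in clients]
    if heterogeneity_proxy(local_models) <= threshold:
        return "FedAvg"
    return "pure local training"

# Toy usage: three clients whose true parameters differ only slightly,
# so the proxy should fall below the (assumed) threshold.
rng = np.random.default_rng(0)
w_star = rng.normal(size=5)
clients = []
for _ in range(3):
    w_k = w_star + 0.05 * rng.normal(size=5)   # mild heterogeneity
    X = rng.normal(size=(50, 5))
    y = X @ w_k + 0.1 * rng.normal(size=50)
    clients.append((X, y))
print(choose_baseline(clients, threshold=0.5))  # expect "FedAvg"
```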