The increasing size of data generated by smartphones and IoT devices motivated the development of Federated Learning (FL), a framework for on-device collaborative training of machine learning models. First efforts in FL focused on learning a single global model with good average performance across clients, but the global model may be arbitrarily bad for a given client, due to the inherent heterogeneity of local data distributions. Federated multi-task learning (MTL) approaches can learn personalized models by formulating an opportune penalized optimization problem. The penalization term can capture complex relations among personalized models, but eschews clear statistical assumptions about local data distributions. In this work, we propose to study federated MTL under the flexible assumption that each local data distribution is a mixture of unknown underlying distributions. This assumption encompasses most of the existing personalized FL approaches and leads to federated EM-like algorithms for both client-server and fully decentralized settings. Moreover, it provides a principled way to serve personalized models to clients not seen at training time. The algorithms' convergence is analyzed through a novel federated surrogate optimization framework, which can be of general interest. Experimental results on FL benchmarks show that our approach provides models with higher accuracy and fairness than state-of-the-art methods.
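As a sketch of the central modeling assumption (the notation below is assumed for illustration, not taken verbatim from the paper): each client $t$'s local data distribution $\mathcal{D}_t$ is a mixture of $M$ shared but unknown underlying distributions $\tilde{\mathcal{D}}_1, \dots, \tilde{\mathcal{D}}_M$, with client-specific mixture weights $\pi_t$ on the simplex,

\[
\mathcal{D}_t = \sum_{m=1}^{M} \pi_{t,m}\, \tilde{\mathcal{D}}_m, \qquad \pi_{t,m} \ge 0, \quad \sum_{m=1}^{M} \pi_{t,m} = 1.
\]

Under this assumption, an EM-like federated algorithm alternates a local E-step, in which each client estimates per-sample responsibilities over the $M$ components, with an M-step that updates the $M$ shared component models from the responsibility-weighted local data. Each client's personalized model is then the $\pi_t$-weighted combination of the shared component models, which is why a client unseen at training time can be served by fitting only its own weights $\pi$.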