We present a novel theoretical analysis of Federated SARSA (FedSARSA) with linear function approximation and local training. We establish convergence guarantees for FedSARSA in the presence of heterogeneity in both local transitions and rewards, providing the first sample- and communication-complexity bounds in this setting. At the core of our analysis is a new, exact multi-step error expansion for single-agent SARSA, which is of independent interest. Our analysis precisely quantifies the impact of heterogeneity and establishes the convergence of FedSARSA with multiple local updates per communication round. Crucially, we show that FedSARSA achieves a linear speed-up with respect to the number of agents, up to higher-order terms due to Markovian sampling. Numerical experiments support our theoretical findings.
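To make the setting concrete, the following is a minimal, illustrative sketch (not the paper's implementation) of FedSARSA with linear function approximation: each agent runs several local on-policy SARSA updates on its own heterogeneous MDP, and a server periodically averages the parameters. The environment sizes, feature map, and hyperparameters below are assumptions chosen only for illustration.

```python
# A minimal sketch (not the paper's implementation) of FedSARSA with linear
# value-function approximation: K_LOCAL local SARSA(0) updates per agent,
# followed by parameter averaging at a central server. All sizes and
# hyperparameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
N_AGENTS, N_STATES, N_ACTIONS, DIM = 5, 10, 3, 8      # assumed problem sizes
GAMMA, ALPHA, K_LOCAL, N_ROUNDS, EPS = 0.95, 0.05, 20, 50, 0.1

# Random linear feature map phi(s, a), shared by all agents (assumption).
PHI = rng.normal(size=(N_STATES, N_ACTIONS, DIM)) / np.sqrt(DIM)

def make_agent_mdp(seed):
    """Heterogeneous local MDP: each agent has its own transitions and rewards."""
    g = np.random.default_rng(seed)
    P = g.dirichlet(np.ones(N_STATES), size=(N_STATES, N_ACTIONS))
    R = g.uniform(0.0, 1.0, size=(N_STATES, N_ACTIONS))
    return P, R

def eps_greedy(theta, s, g):
    """Epsilon-greedy action w.r.t. the linear action-value estimate."""
    if g.random() < EPS:
        return int(g.integers(N_ACTIONS))
    return int(np.argmax(PHI[s] @ theta))

def local_sarsa(theta, P, R, s, a, g):
    """Run K_LOCAL on-policy SARSA(0) updates on this agent's own MDP."""
    theta = theta.copy()
    for _ in range(K_LOCAL):
        s_next = g.choice(N_STATES, p=P[s, a])          # Markovian sampling
        a_next = eps_greedy(theta, s_next, g)
        td_err = (R[s, a] + GAMMA * PHI[s_next, a_next] @ theta
                  - PHI[s, a] @ theta)
        theta += ALPHA * td_err * PHI[s, a]
        s, a = s_next, a_next
    return theta, s, a

# Server loop: broadcast the global parameter, local training, then averaging.
mdps = [make_agent_mdp(i) for i in range(N_AGENTS)]
theta_global = np.zeros(DIM)
states = [0] * N_AGENTS
actions = [eps_greedy(theta_global, 0, rng) for _ in range(N_AGENTS)]
for _ in range(N_ROUNDS):
    local_params = []
    for i, (P, R) in enumerate(mdps):
        th, states[i], actions[i] = local_sarsa(theta_global, P, R,
                                                states[i], actions[i], rng)
        local_params.append(th)
    theta_global = np.mean(local_params, axis=0)        # federated averaging
print("final parameter norm:", np.linalg.norm(theta_global))
```

In this sketch, heterogeneity enters through the per-agent transition matrices and reward tables, and the averaging step is what lets the agents pool their samples; the paper's analysis quantifies how far the averaged iterate can drift from each agent's own fixed point as the number of local updates grows.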