Zero-shot learning (ZSL) is a learning regime that recognizes unseen classes by generalizing the visual-semantic relationship learned from seen classes. To obtain an effective ZSL model, one may resort to curating training samples from multiple sources, which inevitably raises privacy concerns about data sharing across organizations. In this paper, we propose a novel Federated Zero-Shot Learning (FedZSL) framework, which learns a central model from decentralized data residing on edge devices. To better generalize to previously unseen classes, FedZSL allows the training data on each device to be sampled from non-overlapping classes, a setting far from the i.i.d. data that traditional federated learning commonly assumes. We identify two key challenges in our FedZSL protocol: 1) the trained models are prone to bias toward the locally observed classes, and thus fail to generalize to unseen classes and/or to seen classes that appear on other devices; 2) as each category in the training data comes from a single source, the central model is highly vulnerable to model replacement (backdoor) attacks. To address the first issue, we propose three local objectives for visual-semantic alignment and cross-device alignment through relation distillation, which leverages the normalized class-wise covariance to regularize the consistency of the prediction logits across devices. To defend against backdoor attacks, we propose a feature magnitude defending technique: because malicious samples are less correlated with the given semantic attributes, visual features of low magnitude are discarded to stabilize model updates. Extensive experiments on three zero-shot benchmark datasets demonstrate the effectiveness and robustness of FedZSL.
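The feature magnitude defense described above can be illustrated with a minimal sketch. The abstract does not state how magnitude is measured or where the cutoff lies, so everything concrete here is an assumption made for illustration: magnitude is taken to be the L2 norm of each visual feature vector, and the hypothetical function `filter_low_magnitude` with its `keep_ratio` parameter keeps a fixed fraction of the highest-norm features. It should not be read as the authors' implementation.

```python
import math


def filter_low_magnitude(features, keep_ratio=0.8):
    """Discard the feature vectors with the smallest L2 norm.

    Assumptions (not taken from the paper): magnitude is the L2 norm,
    and the cutoff is a fixed fraction `keep_ratio` of the batch.
    Returns the kept vectors and their original indices.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    # L2 norm of every feature vector in the batch.
    norms = [math.sqrt(sum(x * x for x in vec)) for vec in features]

    # Number of vectors to keep (at least one for a non-empty batch).
    n_keep = max(1, math.ceil(keep_ratio * len(features)))

    # Rank indices by descending norm, keep the top n_keep,
    # then restore the original batch order.
    ranked = sorted(range(len(features)), key=lambda i: norms[i], reverse=True)
    kept_idx = sorted(ranked[:n_keep])

    return [features[i] for i in kept_idx], kept_idx


if __name__ == "__main__":
    batch = [[3.0, 4.0], [0.1, 0.0], [1.0, 0.0], [0.0, 2.0]]
    kept, idx = filter_low_magnitude(batch, keep_ratio=0.5)
    print(idx, kept)
```

In a federated setting, a filter of this kind would presumably run on the features before the local update is computed, so that low-magnitude (potentially poisoned) samples contribute nothing to the model update sent to the server. A fixed ratio is only one possible cutoff; an absolute norm threshold would be an equally plausible choice.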