Domain adaptation assumes that samples from both the source and target domains are freely accessible during the training phase. However, this assumption rarely holds in the real world and may raise data-privacy issues, especially when source-domain labels are sensitive attributes that can serve as identifiers. To avoid accessing source data that may contain sensitive information, we introduce progressive domain adaptation (PrDA). Our key idea is to leverage a model pre-trained on the source domain and progressively update the target model in a self-learning manner. We observe that target samples with lower self-entropy, measured by the pre-trained source model, are more likely to be classified correctly. Based on this observation, we select reliable samples using the self-entropy criterion and define them as class prototypes. We then assign a pseudo label to every target sample based on its similarity score to the class prototypes. Furthermore, to reduce the uncertainty of the pseudo-labeling process, we propose a set-to-set distance-based filtering that does not require any tunable hyperparameters. Finally, we train the target model with the filtered pseudo labels, regularized by the pre-trained source model. Surprisingly, without direct use of labeled source samples, our PrDA outperforms conventional domain adaptation methods on benchmark datasets. Our code is publicly available at https://github.com/youngryan1993/PrDA-Progressive-Domain-Adaptation-from-a-Source-Pre-trained-Model.
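To make the selection and pseudo-labeling steps concrete, the following is a minimal sketch (in PyTorch, not the authors' released code) of the two ideas described above: picking low self-entropy target samples under the pre-trained source model as class prototypes, and pseudo-labeling every target sample by similarity to those prototypes. All function and variable names here are illustrative assumptions, and the set-to-set filtering step is not shown.

```python
import torch
import torch.nn.functional as F

def select_prototypes(features, probs, num_per_class):
    """Pick the lowest self-entropy target samples per predicted class as prototypes.

    features: (N, D) target features from the source pre-trained model
    probs:    (N, C) softmax outputs of the source pre-trained model
    Classes with no predicted samples are not handled in this sketch.
    """
    # self-entropy of each target sample under the source model
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)      # (N,)
    preds = probs.argmax(dim=1)                                       # (N,)
    prototypes = []
    for c in range(probs.size(1)):
        idx = (preds == c).nonzero(as_tuple=True)[0]
        # keep the most reliable (lowest self-entropy) samples of class c
        keep = idx[entropy[idx].argsort()[:num_per_class]]
        prototypes.append(F.normalize(features[keep], dim=1).mean(dim=0))
    return torch.stack(prototypes)                                    # (C, D)

def assign_pseudo_labels(features, prototypes):
    """Pseudo-label each target sample by cosine similarity to the class prototypes."""
    sims = F.normalize(features, dim=1) @ prototypes.t()              # (N, C)
    return sims.argmax(dim=1), sims
```

In this sketch the pseudo labels would then be filtered and used to train the target model; cosine similarity is used as the similarity score for illustration only.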