Deep supervised learning has achieved great success over the last decade. However, its dependence on manual labels and its vulnerability to attacks have driven researchers to explore alternatives. Self-supervised learning has attracted wide attention in recent years for its strong performance on representation learning. Self-supervised representation learning uses the input data itself as supervision and benefits almost all types of downstream tasks. In this survey, we examine recent self-supervised learning methods for representation learning in computer vision, natural language processing, and graph learning. We comprehensively review existing empirical methods and group them into three main categories according to their objectives: generative, contrastive, and generative-contrastive (adversarial). We further investigate related theoretical analyses to offer deeper insight into how self-supervised learning works. Finally, we briefly discuss open problems and future directions for self-supervised learning. An outline slide for the survey is provided.
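To make the contrastive category concrete, below is a minimal NumPy sketch of an InfoNCE-style objective, the loss family underlying many contrastive methods the survey covers: each sample's augmented view is pulled toward its positive pair while other samples in the batch act as negatives. The function name `info_nce_loss`, the temperature value, and the toy data are illustrative assumptions, not code from the survey.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """Minimal InfoNCE-style contrastive loss (illustrative sketch).

    anchors, positives: (N, D) arrays of L2-normalized embeddings.
    Row i of `positives` is the positive view for row i of `anchors`;
    all other rows serve as in-batch negatives.
    """
    # Cosine-similarity logits between every anchor and every candidate.
    logits = anchors @ positives.T / temperature            # shape (N, N)
    # Log-softmax over candidates; the diagonal holds the positive pairs.
    logits -= logits.max(axis=1, keepdims=True)             # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy usage: two slightly perturbed "views" of the same 4 samples.
rng = np.random.default_rng(0)
z1 = rng.normal(size=(4, 8))
z1 /= np.linalg.norm(z1, axis=1, keepdims=True)
z2 = z1 + 0.05 * rng.normal(size=(4, 8))
z2 /= np.linalg.norm(z2, axis=1, keepdims=True)
print(info_nce_loss(z1, z2))
```

Generative objectives, by contrast, reconstruct the input itself (e.g. masked tokens or pixels), and generative-contrastive (adversarial) objectives train a discriminator to tell real data from generated data.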