Jigsaw-ViT: Learning Jigsaw Puzzles in Vision Transformer - Zhuanzhi (专知) Papers


Generalization theory · Image classification · Vision · Transform · Learning

January 5, 2023

Jigsaw-ViT: Learning Jigsaw Puzzles in Vision Transformer


Yingyi Chen, Xi Shen, Yahui Liu, Qinghua Tao, Johan A. K. Suykens

From arXiv. Accepted to Pattern Recognition Letters 2022. Project page: https://yingyichen-cyy.github.io/Jigsaw-ViT/

The success of Vision Transformer (ViT) in various computer vision tasks has made this convolution-free network increasingly prevalent. Because ViT operates on image patches, it is potentially relevant to jigsaw puzzle solving, a classical self-supervised task that aims to reorder shuffled sequential image patches back into their natural form. Despite its simplicity, solving jigsaw puzzles has been shown to help diverse tasks using Convolutional Neural Networks (CNNs), such as self-supervised feature representation learning, domain generalization, and fine-grained classification. In this paper, we explore solving jigsaw puzzles as a self-supervised auxiliary loss in ViT for image classification, and name the resulting model Jigsaw-ViT. We show two modifications that can make Jigsaw-ViT superior to standard ViT: discarding positional embeddings and masking patches randomly. Although simple, Jigsaw-ViT improves both generalization and robustness over the standard ViT, two properties that usually trade off against each other. Experimentally, we show that adding the jigsaw puzzle branch provides better generalization than ViT on large-scale image classification on ImageNet. Moreover, the auxiliary task also improves robustness to noisy labels on Animal-10N, Food-101N, and Clothing1M, as well as to adversarial examples. Our implementation is available at https://yingyichen-cyy.github.io/Jigsaw-ViT/.
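The idea described in the abstract, a classification loss combined with a jigsaw auxiliary loss computed on shuffled and randomly masked patches, can be sketched roughly as follows. This is a minimal illustration in plain Python and not the authors' implementation: the function names, the weighting coefficient `eta`, the `mask_ratio` value, and the uniform stand-in logits are all assumptions made for the example. The sketch treats the jigsaw task as predicting each kept patch's original position with a cross-entropy loss, which is one plausible formulation; the real model would produce the logits from patch tokens that carry no positional embeddings, presumably so that the position cannot simply be read off the input.

```python
import math
import random


def make_jigsaw_input(patches, mask_ratio, rng):
    """Shuffle patches and randomly drop a fraction of them.

    Returns the kept patches (in shuffled order) and, for each kept patch,
    its original position index, which serves as the jigsaw target.
    """
    n = len(patches)
    order = list(range(n))
    rng.shuffle(order)                       # random permutation of positions
    n_keep = max(1, round(n * (1.0 - mask_ratio)))
    kept = order[:n_keep]                    # masked patches are simply dropped
    return [patches[i] for i in kept], kept


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for a single example."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def jigsaw_loss(position_logits, targets):
    """Mean cross-entropy over kept patches; each row scores every position."""
    losses = [cross_entropy(row, t) for row, t in zip(position_logits, targets)]
    return sum(losses) / len(losses)


def total_loss(cls_logits, label, position_logits, targets, eta=0.1):
    """Classification loss plus the weighted jigsaw auxiliary loss.

    `eta` is a hypothetical weighting coefficient chosen for illustration.
    """
    return cross_entropy(cls_logits, label) + eta * jigsaw_loss(position_logits, targets)


# Usage: 16 placeholder patches, 25% of them masked.
rng = random.Random(0)
patches = [f"patch{i}" for i in range(16)]
shuffled, targets = make_jigsaw_input(patches, mask_ratio=0.25, rng=rng)

# Stand-in for the model outputs: uniform logits over 16 positions / 10 classes.
position_logits = [[0.0] * len(patches) for _ in targets]
cls_logits = [0.0] * 10
loss = total_loss(cls_logits, 3, position_logits, targets, eta=0.1)
```

Dropping masked patches from the sequence, rather than replacing them with a mask token, is just one way to realize "masking patches randomly"; the abstract does not specify which variant is used.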

