Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture - Zhuanzhi Papers


January 19, 2023

Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture


Mahmoud Assran, Quentin Duval, Ishan Misra, Piotr Bojanowski, Pascal Vincent, Michael Rabbat, Yann LeCun, Nicolas Ballas

This paper demonstrates an approach for learning highly semantic image representations without relying on hand-crafted data-augmentations. We introduce the Image-based Joint-Embedding Predictive Architecture (I-JEPA), a non-generative approach for self-supervised learning from images. The idea behind I-JEPA is simple: from a single context block, predict the representations of various target blocks in the same image. A core design choice to guide I-JEPA towards producing semantic representations is the masking strategy; specifically, it is crucial to (a) predict several target blocks in the image, (b) sample target blocks with sufficiently large scale (occupying 15%-20% of the image), and (c) use a sufficiently informative (spatially distributed) context block. Empirically, when combined with Vision Transformers, we find I-JEPA to be highly scalable. For instance, we train a ViT-Huge/16 on ImageNet using 32 A100 GPUs in under 38 hours to achieve strong downstream performance across a wide range of tasks requiring various levels of abstraction, from linear classification to object counting and depth prediction.
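The masking strategy described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: only the 15%-20% target scale comes from the abstract. The 14x14 patch grid (a 224-pixel image with 16-pixel patches), the choice of four target blocks, the aspect-ratio range, the 85%-100% context scale, and the step that removes target patches from the context are all assumptions made for this example, and the function names are hypothetical.

```python
import math
import random


def sample_block(grid, scale_range, aspect_range, rng):
    """Sample one rectangular block on a grid x grid patch grid.

    Returns (top, left, height, width) in patch units.
    """
    scale = rng.uniform(*scale_range)    # fraction of the image area
    aspect = rng.uniform(*aspect_range)  # height / width ratio
    area = scale * grid * grid
    h = min(grid, max(1, round(math.sqrt(area * aspect))))
    w = min(grid, max(1, round(math.sqrt(area / aspect))))
    top = rng.randint(0, grid - h)
    left = rng.randint(0, grid - w)
    return top, left, h, w


def block_patches(block, grid):
    """Convert a block into the set of flat patch indices it covers."""
    top, left, h, w = block
    return {r * grid + c
            for r in range(top, top + h)
            for c in range(left, left + w)}


def sample_masks(grid=14, num_targets=4, seed=0):
    """Sample several target blocks and one large context block."""
    rng = random.Random(seed)
    # (a) several target blocks, (b) each covering roughly 15%-20% of the image.
    targets = [
        block_patches(sample_block(grid, (0.15, 0.20), (0.75, 1.5), rng), grid)
        for _ in range(num_targets)
    ]
    # (c) one large, spatially distributed context block (scale is an assumption).
    context = block_patches(sample_block(grid, (0.85, 1.0), (1.0, 1.0), rng), grid)
    # Assumption: drop target patches from the context so the prediction
    # cannot be solved by copying visible patches.
    for target in targets:
        context -= target
    return context, targets


if __name__ == "__main__":
    context, targets = sample_masks()
    print("context patches:", len(context))
    print("target sizes:", [len(t) for t in targets])
```

In such a setup, the context patches would be fed to a context encoder, and a predictor would be asked to output the representations of each target block as computed by a separate target encoder; those components are outside the scope of this sketch.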


