ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings


June 24, 2022


Arjun Majumdar, Gunjan Aggarwal, Bhavika Devnani, Judy Hoffman, Dhruv Batra

We present a scalable approach for learning open-world object-goal navigation (ObjectNav) -- the task of asking a virtual robot (agent) to find any instance of an object in an unexplored environment (e.g., "find a sink"). Our approach is entirely zero-shot -- i.e., it does not require ObjectNav rewards or demonstrations of any kind. Instead, we train on the image-goal navigation (ImageNav) task, in which agents find the location where a picture (i.e., goal image) was captured. Specifically, we encode goal images into a multimodal, semantic embedding space to enable training semantic-goal navigation (SemanticNav) agents at scale in unannotated 3D environments (e.g., HM3D). After training, SemanticNav agents can be instructed to find objects described in free-form natural language (e.g., "sink", "bathroom sink", etc.) by projecting language goals into the same multimodal, semantic embedding space. As a result, our approach enables open-world ObjectNav. We extensively evaluate our agents on three ObjectNav datasets (Gibson, HM3D, and MP3D) and observe absolute improvements in success of 4.2% - 20.0% over existing zero-shot methods. For reference, these gains are similar to or better than the 5% improvement in success between the Habitat 2020 and 2021 ObjectNav challenge winners. In an open-world setting, we discover that our agents can generalize to compound instructions with a room explicitly mentioned (e.g., "Find a kitchen sink") and when the target room can be inferred (e.g., "Find a sink and a stove").
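The core idea in the abstract, training on image goals and then swapping in language goals that live in the same embedding space, can be illustrated with a small toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the hash-based `word_vector` stands in for a pretrained multimodal encoder, an "image" is reduced to a list of object names visible in it, and the names `encode_image_goal`, `encode_text_goal`, and `DIM` are hypothetical.

```python
import hashlib
import math

DIM = 16  # toy embedding size (assumption; real encoders are much larger)


def word_vector(word):
    """Deterministic pseudo-random vector for a word (toy stand-in for a learned encoder)."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    return [b / 127.5 - 1.0 for b in digest[:DIM]]


def embed(words):
    """Sum the word vectors and scale the result to unit length."""
    if not words:
        raise ValueError("need at least one word to embed")
    total = [0.0] * DIM
    for w in words:
        for i, x in enumerate(word_vector(w)):
            total[i] += x
    norm = math.sqrt(sum(x * x for x in total))
    return [x / norm for x in total]


def encode_image_goal(visible_objects):
    """Training-time goal: a goal image, reduced here to the objects it shows."""
    return embed(visible_objects)


def encode_text_goal(instruction):
    """Test-time goal: a free-form language description, mapped into the same space."""
    return embed(instruction.split())


def cosine(a, b):
    """Cosine similarity of two unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# A goal-conditioned agent only ever sees a goal vector, so an image goal used
# during training and a text goal used at test time are interchangeable inputs.
image_goal = encode_image_goal(["sink"])
text_goal = encode_text_goal("sink")
other_goal = encode_image_goal(["bed"])
print(cosine(text_goal, image_goal), cosine(text_goal, other_goal))
```

In this toy, the text goal "sink" lands on the same point as an image showing a sink, which is the property that lets a policy trained only on image goals accept language goals without ObjectNav-specific training. A real system would rely on a pretrained vision-language encoder to provide that alignment rather than on a shared hash.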

