通过明确的上下文表述使代理人行为普遍化 (Generalization of Agent Behavior through Explicit Representation of Context) - 专知论文

会员服务 ·

0

泛化理论 · Principle · Flappy Bird · Continuity · INTERACT ·

2020 年 6 月 18 日

Generalization of Agent Behavior through Explicit Representation of Context

翻译：通过明确的上下文表述使代理人行为普遍化

Cem C Tutum,Suhaib Abdulquddos,Risto Miikkulainen

from arxiv, 7 pages, 6 figures, 3 tables. arXiv admin note: substantial text overlap with arXiv:2002.05640

In order to deploy autonomous agents in digital interactive environments, they must be able to act robustly in unseen situations. The standard machine learning approach is to include as much variation as possible into training these agents. The agents can then interpolate within their training, but they cannot extrapolate much beyond it. This paper proposes a principled approach where a context module is coevolved with a skill module in the game. The context module recognizes the temporal variation in the game and modulates the outputs of the skill module so that the action decisions can be made robustly even in previously unseen situations. The approach is evaluated in the Flappy Bird and LunarLander video games, as well as in the CARLA autonomous driving simulation. The Context+Skill approach leads to significantly more robust behavior in environments that require extrapolation beyond training. Such a principled generalization ability is essential in deploying autonomous agents in real-world tasks, and can serve as a foundation for continual adaptation as well.

翻译：为了在数字互动环境中部署自主代理器,他们必须能够在不为人知的情况下采取强有力的行动。标准的机器学习方法是在培训这些代理器时尽可能多地包含差异。代理器然后可以在培训中进行内插, 但是不能外推。本文提出了一个原则性方法, 环境模块与游戏中的技能模块相融合。上下文模块承认游戏的时间差异, 并调整技能模块的输出, 以便即使在以前不为人知的情况下也能强有力地做出行动决定。该方法在Flappy Bird和LunarLander视频游戏以及CARLA自动驾驶模拟中进行了评估。环境+技能方法导致在需要超出培训外推算的环境中采取更强有力的行为。这种有原则性的一般化能力对于在现实世界任务中部署自主代理器至关重要, 并且可以作为持续适应的基础。

0

相关内容

泛化理论

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

250+阅读 · 2020年4月19日

【CVPR2020-台大】透视眼：学会透过障碍物看东西，Learning to See Through Obstructions

【CVPR2020-台大】透视眼：学会透过障碍物看东西，Learning to See Through Obstructions

专知会员服务

27+阅读 · 2020年4月3日

【ICML2020投稿论文-DeepMind】时序差分学习的推理与泛化，Temporal Difference Learning

专知会员服务

26+阅读 · 2020年3月16日

【论文推荐】基于元学习的小样本链接预测：FEW SHOT LINK PREDICTION VIA META LEARNING

【论文推荐】基于元学习的小样本链接预测：FEW SHOT LINK PREDICTION VIA META LEARNING

专知会员服务

57+阅读 · 2019年12月23日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Imitation Learning for Fashion Style Based on Hierarchical Multimodal Representation

Imitation Learning for Fashion Style Based on Hierarchical Multimodal Representation

Arxiv

8+阅读 · 2020年4月13日

Learning to See Through Obstructions

Learning to See Through Obstructions

Arxiv

7+阅读 · 2020年4月2日

Learning Disentangled Representations for Recommendation

Learning Disentangled Representations for Recommendation

Arxiv

8+阅读 · 2019年10月31日

Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation

Arxiv

3+阅读 · 2019年6月24日

Generalization and Regularization in DQN

Generalization and Regularization in DQN

Arxiv

6+阅读 · 2019年1月30日

Do deep reinforcement learning agents model intentions?

Arxiv

5+阅读 · 2018年5月21日

Neural Network Based Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction

Arxiv

6+阅读 · 2018年4月23日

Eigenoption Discovery through the Deep Successor Representation

Arxiv

3+阅读 · 2018年1月30日

Multi-Task Learning with Labeled and Unlabeled Tasks

Arxiv

3+阅读 · 2017年6月8日

DeepWalk: Online Learning of Social Representations

Arxiv

8+阅读 · 2014年6月27日

VIP会员

文章信息

相关主题

相关VIP内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

250+阅读 · 2020年4月19日

【CVPR2020-台大】透视眼：学会透过障碍物看东西，Learning to See Through Obstructions

【CVPR2020-台大】透视眼：学会透过障碍物看东西，Learning to See Through Obstructions

专知会员服务

27+阅读 · 2020年4月3日

【ICML2020投稿论文-DeepMind】时序差分学习的推理与泛化，Temporal Difference Learning

专知会员服务

26+阅读 · 2020年3月16日

【论文推荐】基于元学习的小样本链接预测：FEW SHOT LINK PREDICTION VIA META LEARNING

【论文推荐】基于元学习的小样本链接预测：FEW SHOT LINK PREDICTION VIA META LEARNING

专知会员服务

57+阅读 · 2019年12月23日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

新质生成式AI赋能产业变革的实践与路径

用于多模态大模型的离散标记化：全面综述

Nature综述：金融网络中的物理学

【CMU博士论文】通信高效且差分隐私的优化方法

相关资讯

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Imitation Learning for Fashion Style Based on Hierarchical Multimodal Representation

Imitation Learning for Fashion Style Based on Hierarchical Multimodal Representation

Arxiv

8+阅读 · 2020年4月13日

Learning to See Through Obstructions

Learning to See Through Obstructions

Arxiv

7+阅读 · 2020年4月2日

Learning Disentangled Representations for Recommendation

Learning Disentangled Representations for Recommendation

Arxiv

8+阅读 · 2019年10月31日

Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation

Arxiv

3+阅读 · 2019年6月24日

Generalization and Regularization in DQN

Generalization and Regularization in DQN

Arxiv

6+阅读 · 2019年1月30日

Do deep reinforcement learning agents model intentions?

Arxiv

5+阅读 · 2018年5月21日

Neural Network Based Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction

Arxiv

6+阅读 · 2018年4月23日

Eigenoption Discovery through the Deep Successor Representation

Arxiv

3+阅读 · 2018年1月30日

Multi-Task Learning with Labeled and Unlabeled Tasks

Arxiv

3+阅读 · 2017年6月8日

DeepWalk: Online Learning of Social Representations

Arxiv

8+阅读 · 2014年6月27日

微信扫码咨询专知VIP会员