State Representation Learning for Control: An Overview

January 10, 2019 · CreateAMind


What the paper lacks is the latest actionable 3D conv work, but it covers a great deal of content.

TCN v2 + 3D conv for motion information










As an example, infants expect inanimate objects to follow principles of persistence, continuity, cohesion and solidity before appearance-based elements such as color, texture and perceptual goodness. At the same time, these principles help guide later learning, such as objects' rigidity, softness and the properties of liquids.

Later, adults will reconstruct perceptual scenes using internal representations of the objects and their physically relevant properties (mass, elasticity, friction, gravity, collision, etc.) [Lake et al., 2016].


2.2

Using prior knowledge to constrain the state space: A last approach is to handle SRL by using specific constraints or prior knowledge about the functioning, dynamics or physics of the world (besides the constraints of forward and inverse models), such as temporal continuity or causality principles that generally reflect the interaction of the agent with objects or with the environment [Jonschkowski and Brock, 2015]. Priors are defined as objective or loss functions L, applied on a set of states s_{1:n} (Fig. 5), to be minimized (or maximized) under a specific condition c. An example of such a condition is enforcing locality or time proximity within the set of states.

Loss = L_prior(s_{1:n}; θ_φ | c)    (5)

All these approaches are detailed in Section 3.
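As an illustration of Eq. (5), the sketch below implements the temporal-coherence (slowness) prior in the spirit of [Jonschkowski and Brock, 2015], where the condition c is simply time proximity between consecutive states. The tensor layout and the function name are assumptions made here for clarity, not details taken from the surveyed paper.

    import torch

    def slowness_prior_loss(states: torch.Tensor) -> torch.Tensor:
        """Temporal-coherence (slowness) prior: E[ ||s_{t+1} - s_t||^2 ].

        `states` is assumed to be a (T, state_dim) tensor of encoder outputs
        phi(o_1), ..., phi(o_T) for T consecutive observations; the condition c
        here is simply time proximity between consecutive steps.
        """
        deltas = states[1:] - states[:-1]        # Δs_t = s_{t+1} - s_t
        return (deltas ** 2).sum(dim=1).mean()   # mean squared state change

Minimizing this term encourages temporally close observations to be mapped to nearby states, which is exactly the kind of condition c that Eq. (5) refers to.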



2.3 State representation characteristics

Besides the general idea that the state representation has the role of encoding essential information (for a given task) while discarding irrelevant aspects of the original data, let us detail what the characteristics of a good state representation are.

In a reinforcement learning framework, the authors of [Böhmer et al., 2015] define a good state representation as a representation that is:

• Markovian, i.e. it summarizes all the necessary information to be able to choose an action within the policy, by looking only at the current state.

• Able to represent the true value of the current state well enough for policy improvement.

• Able to generalize the learned value-function to unseen states with similar futures.

• Low dimensional for efficient estimation.


3 Learning objectives

In this section, we review which objectives can be used to learn a relevant state representation. A schema detailing the core elements involved in each model's loss function was introduced in Figs. 2–5, which highlight the main approaches described here. This section touches upon machine learning tools used in SRL, such as auto-encoders or siamese networks; a more detailed description of these is given in Section 4.

3.1 Reconstructing the observation

3.2 Learning a forward model

3.3 Learning an inverse model

3.4 Using feature adversarial learning

3.5 Exploiting rewards

3.6 Other objective functions

• Slowness Principle

• Variability

• Proportionality

• Repeatability

• Dynamic verification

• Selectivity
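Two of the objectives listed above, proportionality and repeatability, can be written directly as loss terms over pairs of transitions that share the same action. The sketch below follows the formulations usually attributed to [Jonschkowski and Brock, 2015]; the assumption that action-matched pairs are pre-selected by the caller is ours, made to keep the example short.

    import torch

    def proportionality_loss(delta_s1: torch.Tensor, delta_s2: torch.Tensor) -> torch.Tensor:
        """Proportionality prior: the same action should cause state changes of
        similar magnitude.  `delta_s1`, `delta_s2` are (N, state_dim) batches of
        state changes Δs taken at time-step pairs where the same action was applied.
        """
        return ((delta_s2.norm(dim=1) - delta_s1.norm(dim=1)) ** 2).mean()

    def repeatability_loss(s1: torch.Tensor, s2: torch.Tensor,
                           delta_s1: torch.Tensor, delta_s2: torch.Tensor) -> torch.Tensor:
        """Repeatability prior: the same action taken in similar states should
        produce similar state changes; similarity of the starting states is
        weighted by a Gaussian kernel on their squared distance.
        """
        weights = torch.exp(-((s2 - s1) ** 2).sum(dim=1))
        return (weights * ((delta_s2 - delta_s1) ** 2).sum(dim=1)).mean()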

3.7 Using hybrid objectives

4 Building blocks of State Representation Learning

In this section, we cover various implementation aspects relevant to state representation learning and its evaluation. We refer to specific surrogate models, loss-function specification tools and strategies that help constrain the information bottleneck and generalize when learning low-dimensional state representations.

4.1 Learning tools

We first detail a set of models that, through an auxiliary objective function, help learn a state representation. One or several of these learning tools can be integrated into broader SRL approaches, as previously described.
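As a concrete illustration of the simplest such tool, the sketch below shows a plain auto-encoder (Section 4.1.1) used as a state encoder together with the reconstruction objective of Section 3.1. The layer sizes, dimensions and the MSE loss are illustrative assumptions, not values taken from the surveyed paper.

    import torch
    import torch.nn as nn

    class StateAutoEncoder(nn.Module):
        """Minimal auto-encoder as an SRL building block.

        The encoder compresses an observation o_t into a low-dimensional state
        s_t = phi(o_t); the decoder reconstructs o_t from s_t, so the
        reconstruction loss constrains the information bottleneck.
        """
        def __init__(self, obs_dim: int = 784, state_dim: int = 8):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(obs_dim, 128), nn.ReLU(),
                nn.Linear(128, state_dim),
            )
            self.decoder = nn.Sequential(
                nn.Linear(state_dim, 128), nn.ReLU(),
                nn.Linear(128, obs_dim),
            )

        def forward(self, obs: torch.Tensor):
            state = self.encoder(obs)             # s_t = phi(o_t)
            reconstruction = self.decoder(state)  # o_t ≈ psi(s_t)
            return state, reconstruction

    def reconstruction_loss(obs: torch.Tensor, reconstruction: torch.Tensor) -> torch.Tensor:
        """Reconstruction objective (Section 3.1): mean squared error."""
        return ((reconstruction - obs) ** 2).mean()

In practice the encoder output s_t can also be fed to the other objectives of Section 3 (forward models, inverse models, robotic priors), with the reconstruction term acting as one loss among several in a hybrid objective (Section 3.7).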

4.1.1 Auto-encoders (AE)

4.1.2 Denoising auto-encoders (DAE)

4.1.3 Variational auto-encoders (VAE)

4.1.4 Siamese networks
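For the siamese case, a minimal, hedged sketch is given below: two observations are embedded by the same encoder with shared weights, and a contrastive loss pulls temporally close pairs together while pushing distant pairs at least a margin apart. The hinge-style formulation and the margin value are assumptions chosen for illustration; the encoder can be, for instance, the `StateAutoEncoder(...).encoder` module from the sketch above.

    import torch
    import torch.nn.functional as F

    def contrastive_srl_loss(encoder, obs_a: torch.Tensor, obs_b: torch.Tensor,
                             is_close: torch.Tensor, margin: float = 1.0) -> torch.Tensor:
        """Siamese objective: the same encoder (shared weights) embeds both
        observations.  `is_close` is a (N,) float tensor with 1.0 for temporally
        close pairs and 0.0 otherwise; close pairs are pulled together, distant
        pairs are pushed at least `margin` apart.
        """
        s_a, s_b = encoder(obs_a), encoder(obs_b)
        dist = F.pairwise_distance(s_a, s_b)
        positive = is_close * dist ** 2
        negative = (1.0 - is_close) * torch.clamp(margin - dist, min=0.0) ** 2
        return (positive + negative).mean()

Triplet variants, as in time-contrastive networks, follow the same pattern with an anchor, a positive and a negative observation sharing the encoder.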


