PFPN: Continuous Control of Physically Simulated Characters using Particle Filtering Policy Network

Data-driven methods for physics-based character control using reinforcement learning have been successfully applied to generate high-quality motions. However, existing approaches typically rely on Gaussian distributions to represent the action policy, which can prematurely commit to suboptimal actions when solving high-dimensional continuous control problems for highly-articulated characters. In this paper, to improve the learning performance of physics-based character controllers, we propose a framework that considers a particle-based action policy as a substitute for Gaussian policies. We exploit particle filtering to dynamically explore and discretize the action space, and track the posterior policy represented as a mixture distribution. The resulting policy can replace the unimodal Gaussian policy which has been the staple for character control problems, without changing the underlying model architecture of the reinforcement learning algorithm used to perform policy optimization. We demonstrate the applicability of our approach on various motion capture imitation tasks. Baselines using our particle-based policies achieve better imitation performance and speed of convergence as compared to corresponding implementations using Gaussians, and are more robust to external perturbations during character control. Related code is available at: https://motion-lab.github.io/PFPN.
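To make the idea of a particle-based policy concrete, the following is a minimal sketch for a single action dimension. It is not the authors' implementation: the function names, the shared standard deviation `sigma`, the weight `threshold`, and the rule for splitting a particle's weight among its copies are all assumptions chosen for illustration. The policy is a mixture of Gaussians centred on particle locations, and particles whose weight falls below the threshold are moved onto higher-weight particles.

```python
import math
import random


def softmax(logits):
    """Turn unnormalized particle logits into mixture weights."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(locations, weights, sigma, rng):
    """Draw one action: pick a particle by weight, then add Gaussian noise."""
    i = rng.choices(range(len(locations)), weights=weights, k=1)[0]
    return rng.gauss(locations[i], sigma)


def resample(locations, weights, threshold, rng):
    """Move low-weight ("dead") particles onto live ones (illustrative rule).

    Each dead particle copies the location of a live particle chosen in
    proportion to its weight. The live mass is renormalized to 1, and each
    target's share is split equally among the target and its copies.
    """
    n = len(locations)
    live = [i for i in range(n) if weights[i] >= threshold]
    dead = [i for i in range(n) if weights[i] < threshold]
    if not live or not dead:
        return list(locations), list(weights)

    live_w = [weights[i] for i in live]
    live_mass = sum(live_w)
    groups = {i: [i] for i in live}
    for d in dead:
        target = rng.choices(live, weights=live_w, k=1)[0]
        groups[target].append(d)

    new_locations = list(locations)
    new_weights = [0.0] * n
    for target, members in groups.items():
        share = weights[target] / live_mass / len(members)
        for j in members:
            new_locations[j] = locations[target]
            new_weights[j] = share
    return new_locations, new_weights


# Hypothetical usage: five particles on [-1, 1] with a bimodal weighting.
rng = random.Random(0)
locations = [-1.0, -0.5, 0.0, 0.5, 1.0]
weights = softmax([2.0, -6.0, 0.0, -6.0, 2.0])
new_locations, new_weights = resample(locations, weights, threshold=0.01, rng=rng)
action = sample_action(new_locations, new_weights, sigma=0.1, rng=rng)
```

In this toy setup the two modes at -1 and 1 both keep substantial weight after resampling, which a single Gaussian could not represent. The sketch omits how a learner would update the locations and logits during policy optimization.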

