RLOC: Terrain-Aware Legged Locomotion using Reinforcement Learning and Optimal Control

We present a unified model-based and data-driven approach for quadrupedal planning and control to achieve dynamic locomotion over uneven terrain. We use on-board proprioceptive and exteroceptive feedback to map sensory information and desired base velocity commands into footstep plans using a reinforcement learning (RL) policy. This RL policy is trained in simulation over a wide range of procedurally generated terrains. When run online, the system tracks the generated footstep plans using a model-based motion controller. We evaluate the robustness of our method over a wide variety of complex terrains, where it exhibits behaviors that prioritize stability over aggressive locomotion. Additionally, we introduce two ancillary RL policies for corrective whole-body motion tracking and recovery control. These policies account for changes in physical parameters and external perturbations. We train and evaluate our framework on a complex quadrupedal system, ANYmal version B, and demonstrate transferability to a larger and heavier robot, ANYmal C, without requiring retraining.
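
The data flow described above (sensory feedback and a velocity command go into a policy, the policy emits a footstep plan, and a model-based controller tracks that plan) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Observation` layout, the nominal stance, `placeholder_policy`, and `placeholder_controller` are hypothetical stand-ins, not the trained RL policy or the motion controller used in the paper. The roughness heuristic only mimics the cautious behavior mentioned in the abstract; it is not the learned behavior.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Vec2 = Tuple[float, float]
# Hypothetical plan format: one (x, y) foothold target per leg, in the base frame.
FootstepPlan = List[Vec2]


@dataclass
class Observation:
    """Hypothetical observation layout (not taken from the paper)."""
    proprioception: Sequence[float]   # e.g. joint positions and velocities
    terrain_heights: Sequence[float]  # exteroceptive height samples around the robot (m)
    base_velocity_cmd: Vec2           # desired planar base velocity (m/s)


# Assumed nominal foothold positions for a four-legged robot (m).
NOMINAL_STANCE: FootstepPlan = [(0.3, 0.2), (0.3, -0.2), (-0.3, 0.2), (-0.3, -0.2)]


def placeholder_policy(obs: Observation, horizon: float = 0.4) -> FootstepPlan:
    """Stand-in for the RL footstep policy.

    Shifts each nominal foothold along the commanded velocity and shortens
    the step when the sampled terrain is rough. Illustrative heuristic only.
    """
    heights = obs.terrain_heights
    roughness = (max(heights) - min(heights)) if heights else 0.0
    scale = horizon / (1.0 + 10.0 * roughness)
    vx, vy = obs.base_velocity_cmd
    return [(x + scale * vx, y + scale * vy) for x, y in NOMINAL_STANCE]


def placeholder_controller(feet: FootstepPlan, plan: FootstepPlan,
                           gain: float = 0.5) -> FootstepPlan:
    """Stand-in for the model-based motion controller.

    Moves each foot a fixed fraction of the way toward its planned foothold.
    """
    return [(fx + gain * (px - fx), fy + gain * (py - fy))
            for (fx, fy), (px, py) in zip(feet, plan)]


def control_step(
    obs: Observation,
    feet: FootstepPlan,
    policy: Callable[[Observation], FootstepPlan] = placeholder_policy,
    controller: Callable[[FootstepPlan, FootstepPlan], FootstepPlan] = placeholder_controller,
) -> Tuple[FootstepPlan, FootstepPlan]:
    """One planning-and-tracking cycle: plan footsteps, then track them."""
    plan = policy(obs)
    return plan, controller(feet, plan)


if __name__ == "__main__":
    flat = Observation(proprioception=[], terrain_heights=[0.0, 0.0],
                       base_velocity_cmd=(0.5, 0.0))
    plan, new_feet = control_step(flat, NOMINAL_STANCE)
    print("plan:", plan)
    print("feet after one tracking step:", new_feet)
```

Because the policy and the controller are passed to `control_step` as callables, either one can be replaced without touching the other. This loosely mirrors the separation between the learned planner and the model-based tracker described in the abstract.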
