Planar Robot Casting with Real2Sim2Real Self-Supervised Learning

Manipulation of deformable objects using a single parameterized dynamic action can be useful for tasks such as fly fishing, lofting a blanket, and playing shuffleboard. Such tasks take as input a desired final state and output one parameterized open-loop dynamic robot action that produces a trajectory toward the final state. This is especially challenging for long-horizon trajectories with complex dynamics involving friction. This paper explores the task of Planar Robot Casting (PRC), in which one planar motion of a robot wrist holding one end of a cable causes the other end to slide across the plane toward a desired target. PRC allows the cable to reach points beyond the robot's workspace and has applications for cable management in homes, warehouses, and factories. To efficiently learn a PRC policy for a given cable, we propose Real2Sim2Real, a self-supervised framework that automatically collects physical trajectory examples to tune parameters of a dynamics simulator using Differential Evolution, generates many simulated examples, and then learns a policy using a weighted combination of simulated and physical data. We evaluate Real2Sim2Real with three simulators (Isaac Gym-segmented, Isaac Gym-hybrid, and PyBullet), two function approximators (Gaussian Processes and Neural Networks (NNs)), and three cables with differing stiffness, torsion, and friction. Results on 16 held-out test targets for each cable suggest that the NN PRC policies using Isaac Gym-segmented attain median error distance (as % of cable length) ranging from 8% to 14%, outperforming baselines and policies trained on only real or only simulated examples. Code, data, and videos are available at https://tinyurl.com/robotcast.
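
To make the Real2Sim step concrete, the following is a minimal sketch of tuning simulator parameters with Differential Evolution so that simulated endpoints match physical ones. It is an illustration under stated assumptions, not the authors' implementation: the one-line `simulate_endpoint` model, its two parameters (`friction`, `offset`), the synthetic "physical" data, and all function names are hypothetical stand-ins for a real cable simulator such as Isaac Gym or PyBullet. The helper `median_error_percent` shows how an error distance can be expressed as a percentage of cable length, the metric reported above.

```python
import random
import statistics

G = 9.81  # gravitational acceleration in m/s^2


def simulate_endpoint(speed, params):
    """Toy stand-in for a cable simulator (hypothetical, not the paper's model).

    Predicts how far the free end slides for a given wrist speed.
    """
    friction, offset = params
    return offset + speed ** 2 / (2.0 * friction * G)


def sim_to_real_loss(params, real_data):
    """Mean squared gap between simulated and observed endpoint positions."""
    return statistics.fmean(
        [(simulate_endpoint(v, params) - x) ** 2 for v, x in real_data]
    )


def differential_evolution(loss, bounds, pop_size=20, generations=150,
                           f=0.7, cr=0.9, seed=0):
    """Basic DE/rand/1/bin with greedy selection and bound clipping."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [loss(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct population members other than the target i.
            a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    value = a[d] + f * (b[d] - c[d])
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = loss(trial)
            if trial_score <= scores[i]:  # keep the trial only if no worse
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def median_error_percent(errors_m, cable_length_m):
    """Median endpoint-to-target distance as a percentage of cable length."""
    return 100.0 * statistics.median(errors_m) / cable_length_m


# Synthetic "physical" data generated from assumed ground-truth parameters.
TRUE_PARAMS = (0.3, 0.1)
real_data = [(v, simulate_endpoint(v, TRUE_PARAMS))
             for v in (1.0, 1.5, 2.0, 2.5, 3.0)]

best_params, best_loss = differential_evolution(
    lambda p: sim_to_real_loss(p, real_data),
    bounds=[(0.05, 1.0), (-0.5, 0.5)],
)
print("tuned (friction, offset):", best_params, "loss:", best_loss)
```

In a full pipeline along the lines the abstract describes, the tuned simulator would then generate many simulated examples, and a policy would be fit on a weighted mix of those and the physical examples; that stage is omitted from this sketch.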
