Objects rarely sit in isolation in human environments. As such, we would like our robots to reason about how multiple objects relate to one another and how those relations may change as the robot interacts with the world. To this end, we propose a novel graph neural network framework for multi-object manipulation that predicts how inter-object relations change given robot actions. Our model operates on partial-view point clouds and can reason about multiple objects dynamically interacting during manipulation. By learning a dynamics model in a learned latent graph embedding space, our model enables multi-step planning to reach target goal relations. We show that our model, trained purely in simulation, transfers well to the real world. Our planner enables the robot to rearrange a variable number of objects with a range of shapes and sizes using both push and pick-and-place skills.
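The pipeline described above (encode per-object observations into graph nodes, pass messages between objects, roll the latent graph forward under an action, and decode pairwise relations for planning) can be sketched as follows. This is a minimal illustrative stand-in, not the paper's actual model: all network names, dimensions, the 8-dim object features, and the 4-dim action vector are hypothetical, and the randomly initialized numpy MLPs take the place of trained components.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(params, x):
    """Tiny two-layer MLP: params = (W1, b1, W2, b2)."""
    W1, b1, W2, b2 = params
    return np.tanh(x @ W1 + b1) @ W2 + b2

def init_mlp(d_in, d_hid, d_out):
    return (rng.normal(0, 0.1, (d_in, d_hid)), np.zeros(d_hid),
            rng.normal(0, 0.1, (d_hid, d_out)), np.zeros(d_out))

D = 16  # latent node dimension (hypothetical)

# Hypothetical, untrained components: an encoder from per-object
# point-cloud features, an edge/message network, an action-conditioned
# latent dynamics network, and a pairwise relation classifier head.
enc = init_mlp(8, 32, D)          # object features -> node embedding
msg = init_mlp(2 * D, 32, D)      # (sender, receiver) -> message
dyn = init_mlp(D + D + 4, 32, D)  # (node, aggregated msgs, action) -> next node
rel = init_mlp(2 * D, 32, 1)      # (node_i, node_j) -> relation logit

def gnn_step(nodes, action):
    """One round of message passing followed by an action-conditioned
    latent dynamics update; returns the predicted next latent nodes."""
    n = len(nodes)
    agg = np.zeros((n, D))
    for i in range(n):
        for j in range(n):
            if i != j:
                agg[i] += mlp(msg, np.concatenate([nodes[j], nodes[i]]))
    return np.stack([mlp(dyn, np.concatenate([nodes[i], agg[i], action]))
                     for i in range(n)])

def predict_relations(nodes):
    """Decode pairwise relation probabilities (e.g. contact, 'left of')
    from the latent nodes; here a single sigmoid per ordered pair."""
    n = len(nodes)
    p = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                logit = mlp(rel, np.concatenate([nodes[i], nodes[j]]))[0]
                p[i, j] = 1.0 / (1.0 + np.exp(-logit))
    return p

# Three objects, each summarized by an 8-dim feature vector standing in
# for a partial-view point-cloud encoding; one push action.
obj_feats = rng.normal(size=(3, 8))
nodes = np.stack([mlp(enc, f) for f in obj_feats])
next_nodes = gnn_step(nodes, action=np.array([0.1, 0.0, 0.0, 1.0]))
probs = predict_relations(next_nodes)
print(probs.shape)  # (3, 3) matrix of predicted pairwise relation probabilities
```

Multi-step planning would then score candidate action sequences by rolling `gnn_step` forward repeatedly and comparing the decoded relation matrix against the goal relations, all in latent space, without decoding back to point clouds.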