Training long-horizon robotic policies in complex physical environments is essential for many applications, such as robotic manipulation. However, learning a policy that generalizes to unseen tasks is challenging. In this work, we propose to achieve one-shot task generalization by decoupling plan generation from plan execution. Specifically, our method solves complex long-horizon tasks in three steps: building a paired abstract environment by simplifying geometry and physics, generating abstract trajectories in that environment, and solving the original task with an abstract-to-executable trajectory translator. In the abstract environment, complex dynamics such as physical manipulation are removed, which makes abstract trajectories easier to generate. However, this introduces a large domain gap between the abstract trajectories and the trajectories actually executed, because abstract trajectories lack low-level details and are not aligned frame-to-frame with the executed trajectory. In a manner reminiscent of language translation, our approach uses a seq-to-seq model to bridge this gap, enabling the low-level policy to follow the abstract trajectory. Experimental results on various unseen long-horizon tasks with different robot embodiments demonstrate the practicality of our method for one-shot task generalization.
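As a rough illustration of the decoupling described above, the following toy sketch separates abstract plan generation from plan execution. It is not the method from this work: the abstract environment is assumed to be a point mass in the plane, the abstract planner is plain linear interpolation, and a hand-written bounded-step follower stands in for the learned seq-to-seq translator. The names `generate_abstract_trajectory` and `follow_abstract_trajectory` are hypothetical and chosen only for this example.

```python
import math


def generate_abstract_trajectory(start, goal, num_waypoints=5):
    """Toy abstract planner (assumption): interpolate a point mass from start to goal.

    Dynamics are ignored entirely, mirroring the idea that the abstract
    environment removes low-level physics.
    """
    return [
        tuple(s + (g - s) * i / num_waypoints for s, g in zip(start, goal))
        for i in range(1, num_waypoints + 1)
    ]


def follow_abstract_trajectory(start, abstract_traj, max_step=0.1, tol=1e-6):
    """Hand-written stand-in for the translator (assumption, not a learned model).

    The executor may only move max_step per low-level step, so one abstract
    waypoint can require several executed steps. The two trajectories are
    therefore not aligned frame-to-frame.
    """
    pos = tuple(start)
    executed = [pos]
    for waypoint in abstract_traj:
        while math.dist(pos, waypoint) > tol:
            distance = math.dist(pos, waypoint)
            # Move toward the waypoint, clipped to the step limit.
            scale = min(1.0, max_step / distance)
            pos = tuple(p + (w - p) * scale for p, w in zip(pos, waypoint))
            executed.append(pos)
    return executed


abstract = generate_abstract_trajectory((0.0, 0.0), (1.0, 0.0), num_waypoints=5)
executed = follow_abstract_trajectory((0.0, 0.0), abstract, max_step=0.1)
print(len(abstract), "abstract waypoints ->", len(executed), "executed states")
```

Even in this simplified setting the executed trajectory is longer than the abstract one, which hints at why a sequence-to-sequence mapping, rather than a per-frame mapping, is a natural fit for relating the two.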