6D robotic grasping beyond top-down bin-picking scenarios is a challenging task. Previous solutions based on 6D grasp synthesis with robot motion planning usually operate in an open-loop setting without considering perception feedback, dynamics, and contacts with objects, which makes them sensitive to grasp synthesis errors. In this work, we propose a novel method for learning closed-loop control policies for 6D robotic grasping using point clouds from an egocentric camera. We combine imitation learning and reinforcement learning in order to grasp unseen objects and handle the continuous 6D action space, where expert demonstrations are obtained from a joint motion and grasp planner. We introduce a goal-auxiliary actor-critic algorithm, which uses grasping goal prediction as an auxiliary task to facilitate policy learning. The supervision on grasping goals can be obtained from the expert planner for known objects or from hindsight goals for unknown objects. Overall, our learned closed-loop policy achieves over 90% success rates on grasping various ShapeNet objects and YCB objects in simulation. The policy also transfers well to the real world for grasping unseen objects in both a tabletop setting and a human-robot handover setting in our experiments. Our video can be found at https://sites.google.com/view/gaddpg .
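The abstract describes combining a policy loss with an auxiliary grasp-goal prediction loss, where goal supervision comes from the expert planner (known objects) or from hindsight relabeling (unknown objects). The following is a minimal toy sketch of that idea only; the function names, the squared-error losses, and the `aux_weight` coefficient are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def goal_auxiliary_loss(pred_action, expert_action,
                        pred_goal, target_goal, aux_weight=0.1):
    """Combine a behavior-cloning action loss with a weighted
    auxiliary grasp-goal prediction loss (illustrative sketch)."""
    bc_loss = float(np.mean((pred_action - expert_action) ** 2))
    goal_loss = float(np.mean((pred_goal - target_goal) ** 2))
    return bc_loss + aux_weight * goal_loss

def hindsight_goal(reached_grasp_pose):
    """For unknown objects with no planner label, relabel the grasp
    pose the robot actually reached as the goal (hindsight supervision,
    per the abstract)."""
    return reached_grasp_pose
```

In the full method the two heads share a point-cloud encoder, so the auxiliary goal loss shapes the representation used by the policy; this sketch only shows how the two supervised terms would be combined into one training objective.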